Code for the data collection infrastructure proposed in Habitat-Web: Learning Embodied Object-Search Strategies from Human Demonstrations at Scale.
Habitat-Web is a web application to collect human demonstrations for embodied tasks on Amazon Mechanical Turk (AMT) using the Habitat simulator.
Example PickPlace task on Habitat-Web
Habitat-Web leverages Habitat-Sim and PsiTurk to collect human demonstrations by allowing users to teleoperate virtual robots in the browser via WebGL. The architecture of Habitat-Web is shown in the following figure:
Architecture of Habitat-Web
Our Habitat-Web application is developed in JavaScript and accesses all Habitat-Sim C++ simulator APIs through JavaScript bindings. This lets us use the full set of simulation features available in Habitat. To serve tasks on AMT we use PsiTurk behind an NGINX reverse proxy, and all data is stored in a MySQL database. We use PsiTurk to manage the tasks because it provides an abstraction over the AMT HIT lifecycle and logs task-related metadata.
Additional documentation on the Habitat WebGL application is available here.
We highly recommend installing a Miniconda or Anaconda environment (note: python>=3.6 is required). Once you have Anaconda installed, follow the instructions below.
-
Clone this GitHub repository.
git clone https://github.com/Ram81/habitat-web.git
cd habitat-web
-
Install Dependencies
Common
conda create -n habitat-web python=3.6 cmake=3.14.0
conda activate habitat-web
conda install --file requirements.txt
Linux (Tested with Ubuntu 18.04 with gcc 7.4.0)
sudo apt-get update || true
# These are fairly ubiquitous packages and your system likely has them already,
# but if not, let's get the essentials for EGL support:
sudo apt-get install -y --no-install-recommends \
     libjpeg-dev libglm-dev libgl1-mesa-glx libegl1-mesa-dev mesa-utils xorg-dev freeglut3-dev
-
Download and install emscripten (version 2.0.27 is verified to work)
-
Set EMSCRIPTEN in your environment
export EMSCRIPTEN=/pathto/emsdk/fastcomp/emscripten
-
Build Habitat-Web
./build_and_install_habitat_sim_js.sh
To build with physics simulation via the Bullet Physics SDK, first install Bullet Physics using your system's package manager.
Mac
brew install bullet
Linux
sudo apt-get install libbullet-dev
Next, enable the Bullet physics build via:
./build_and_install_habitat_sim_js.sh --bullet
-
Download the sample dataset and extract it locally into habitat-sim, creating task/data.
-
Download the MP3D dataset using the instructions here: https://github.com/facebookresearch/habitat-lab#scenes-datasets (download the full MP3D dataset for use with Habitat). Move the MP3D scene dataset to task/data/scene_datasets/mp3d, or create a symlink at that path.
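Before moving on, it can help to confirm that the setup steps above left the expected pieces in place. The following is a minimal sketch, not part of the repository: the helper name `check_setup` is hypothetical, and it only checks the three things the instructions mention (the `EMSCRIPTEN` variable, `task/data`, and `task/data/scene_datasets/mp3d`).

```python
import os
from pathlib import Path


def check_setup(repo_root, env=None):
    """Return a list of problems found; an empty list means every check passed.

    Hypothetical helper for illustration only; it is not shipped with Habitat-Web.
    """
    env = os.environ if env is None else env
    root = Path(repo_root)
    problems = []

    # EMSCRIPTEN must be exported and point at an existing directory.
    emscripten = env.get("EMSCRIPTEN")
    if not emscripten:
        problems.append("EMSCRIPTEN is not set")
    elif not Path(emscripten).is_dir():
        problems.append(f"EMSCRIPTEN does not point to a directory: {emscripten}")

    # Extracting the sample dataset is expected to create task/data.
    if not (root / "task" / "data").is_dir():
        problems.append("missing task/data (sample dataset)")

    # MP3D may be a real directory or a symlink; is_dir() follows symlinks.
    if not (root / "task" / "data" / "scene_datasets" / "mp3d").is_dir():
        problems.append("missing task/data/scene_datasets/mp3d (MP3D scenes)")

    return problems
```

Calling `check_setup(".")` from the repository root returns a list of messages that can be printed before starting a build.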
Example of PickPlace task as a standalone application
To enable faster development, we support testing the Habitat WebGL application standalone. Follow these instructions:
-
Start a local HTTP server
cd build_js/task/habitat_web_app/
python3 -m http.server 8001
-
Browse to
ObjectNav task
http://0.0.0.0:8001/bindings.html?defaultPhysConfig=default.physics_config.json&scene=sT4fr6TAbpF.glb&episodeId=0&dataset=objectnav
PickPlace task with physics enabled
http://0.0.0.0:8001/bindings.html?defaultPhysConfig=default.physics_config.json&scene=sT4fr6TAbpF.glb&episodeId=0&dataset=pick_and_place&enablePhysics=true
-
Once loading is complete, use keyboard controls to navigate and interact with the environment.
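The two URLs above differ only in their query parameters. As a rough sketch, the helper below assembles them with the standard library so that switching task, scene, or episode is less error-prone. The function name `standalone_url` and its defaults are assumptions for illustration; the parameter names and values are copied from the URLs above.

```python
from urllib.parse import urlencode


def standalone_url(dataset, enable_physics=False, scene="sT4fr6TAbpF.glb",
                   episode_id=0, host="0.0.0.0", port=8001):
    """Build a URL for the standalone app (hypothetical helper, not in the repo)."""
    params = {
        "defaultPhysConfig": "default.physics_config.json",
        "scene": scene,
        "episodeId": episode_id,
        "dataset": dataset,
    }
    # The PickPlace example adds enablePhysics=true; the ObjectNav example omits it.
    if enable_physics:
        params["enablePhysics"] = "true"
    return f"http://{host}:{port}/bindings.html?{urlencode(params)}"


if __name__ == "__main__":
    print(standalone_url("objectnav"))
    print(standalone_url("pick_and_place", enable_physics=True))
```

Run as a script, this prints the ObjectNav and PickPlace URLs listed above.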
Example of PickPlace task as a PsiTurk experiment
-
Update the route alias in task/nginx.conf (lines 18, 24, and 30).
-
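Create symlinks for the nginx.conf file at /etc/nginx/sites-available/habitat-web.conf and /etc/nginx/sites-enabled/habitat-web.conf, then reload NGINX to enable the new server configuration. Use an absolute path as the link target, because a relative target is resolved relative to the directory that contains the link. From the repository root, run:
ln -s "$(pwd)/task/nginx.conf" /etc/nginx/sites-available/habitat-web.conf
ln -s "$(pwd)/task/nginx.conf" /etc/nginx/sites-enabled/habitat-web.conf
service nginx reload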
-
Set the PsiTurk server port in task/config.txt (default: 8080) and point task/nginx.conf to the same port.
-
Start the PsiTurk server:
cd task/psiturk-habitat-sim
psiturk -e "server on"
-
Open http://localhost:8000/ or http://localhost:YOUR_ENDPOINT_PORT/ in your browser to access the PsiTurk interface. Note that you must use localhost instead of 127.0.0.1, as the compiled habitat-sim application will otherwise attempt to load scene data from S3.
The experiment config can be modified by making changes to task/config.txt. You can find the documentation of PsiTurk configuration files here.
- To launch and manage HITs, refer to the PsiTurk documentation.
-
Collected demonstrations can be downloaded using the sample script task/scripts/data/download_hit_data.py. Run the following command to download collected demonstrations:
python task/scripts/data/download_hit_data.py --db_path <db_name> --dump_path /path/to/dump/data/ --mode <psiturk_server_mode>
--mode - PsiTurk server mode. Refer to the documentation.
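When downloads are scripted, for example once per collection batch, the command above can be assembled programmatically. The sketch below only builds the argument list from the three flags shown above. The helper name `download_command` is hypothetical, and the set of accepted modes ("debug", "sandbox", "live") is an assumption based on PsiTurk's usual server modes, so check it against the PsiTurk documentation.

```python
import sys

# Assumption: these are the PsiTurk server modes; verify against the PsiTurk docs.
VALID_MODES = ("debug", "sandbox", "live")


def download_command(db_path, dump_path, mode, python=sys.executable):
    """Return the argv list for download_hit_data.py (hypothetical helper)."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown PsiTurk mode {mode!r}; expected one of {VALID_MODES}")
    return [
        python,
        "task/scripts/data/download_hit_data.py",
        "--db_path", db_path,
        "--dump_path", dump_path,
        "--mode", mode,
    ]
```

The returned list can be passed to `subprocess.run(..., check=True)` from the repository root. Passing a list rather than a shell string avoids quoting problems in paths.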
For more detailed documentation on data collection and monitoring, refer to the following doc.
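A common misconfiguration in the PsiTurk setup is a port mismatch between task/config.txt and task/nginx.conf. The following sketch checks the two files against each other. It rests on two assumptions that are not confirmed by this repository: that config.txt is an INI-style file with a `port` entry under a `[Server Parameters]` section, and that nginx.conf forwards requests with `proxy_pass http://<host>:<port>` directives. All function names are hypothetical.

```python
import configparser
import re


def psiturk_port(config_text):
    """Read the server port from PsiTurk config text (assumed INI layout)."""
    # interpolation=None keeps literal '%' characters in other values from raising.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    return parser.getint("Server Parameters", "port")


def nginx_upstream_ports(nginx_text):
    """Collect the port of every proxy_pass target, e.g. http://localhost:8080."""
    pattern = r"proxy_pass\s+https?://[^\s;/:]+:(\d+)"
    return {int(port) for port in re.findall(pattern, nginx_text)}


def ports_match(config_text, nginx_text):
    """True if at least one proxy_pass target uses the PsiTurk server port."""
    return psiturk_port(config_text) in nginx_upstream_ports(nginx_text)
```

In practice the two arguments would be the contents of task/config.txt and task/nginx.conf, read with `Path(...).read_text()`.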
If you use this code in your research, please consider citing:
@inproceedings{ramrakhya2022,
title={Habitat-Web: Learning Embodied Object-Search Strategies from Human Demonstrations at Scale},
author={Ram Ramrakhya and Eric Undersander and Dhruv Batra and Abhishek Das},
year={2022},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
}