Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL


Fangwei Zhong, Kui Wu, Hai Ci, Churan Wang, Hao Chen

Peking University, Beihang University, National University of Singapore, and The Hong Kong Polytechnic University.

ECCV 2024

[arXiv] [Project Page]

Installation

Our model relies on DEVA as the visual foundation model and Gym-UnrealCV as the evaluation environment, which requires installing three additional packages: Grounded-Segment-Anything, DEVA, and Gym-UnrealCV. Note that we modified the original DEVA to adapt it to our task; the modified version is provided in this repository.

Clone our repository:

git clone https://github.com/wukui-muc/Offline_RL_Active_Tracking.git

Install Grounded-Segment-Anything:

cd Offline_RL_Active_Tracking
git clone https://github.com/hkchengrex/Grounded-Segment-Anything

export AM_I_DOCKER=False
export BUILD_WITH_CUDA=True
export CUDA_HOME=/path/to/cuda/
# /path/to/cuda/ is the CUDA installation directory, e.g., /usr/local/cuda
# if CUDA was installed via conda, it should be {path_to_conda}/envs/{conda_env_name}/lib/, e.g., ~/anaconda3/envs/offline_evt/lib/

cd Grounded-Segment-Anything
python -m pip install -e segment_anything
python -m pip install -e GroundingDINO
pip install --upgrade diffusers[torch]
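
Optionally, verify the install with a quick import check (a minimal sketch; segment_anything and groundingdino are the package names installed by the two editable installs above):

python -c "import segment_anything, groundingdino; print('Grounded-Segment-Anything OK')"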

Install DEVA:
Directly install the modified DEVA provided in this repository.
(If you encounter a "File 'setup.py' not found" error, upgrade pip with pip install --upgrade pip.)

cd ../Tracking-Anything-with-DEVA # go to the DEVA directory
pip install -e .
bash scripts/download_models.sh # download the pretrained models
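
Optionally, confirm that the modified DEVA is importable (a minimal sketch; deva is the package name installed by the editable install above):

python -c "import deva; print('DEVA OK')"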

Install Gym-Unrealcv:

cd ..
git clone https://github.com/zfw1226/gym-unrealcv.git
cd gym-unrealcv
pip install -e .

Before running the environments, you need to prepare the Unreal binaries. You can download them from the cloud by running load_env.py:

python load_env.py -e {ENV_NAME}

# To run the demo evaluation script, you need to load the UrbanCityMulti environment and textures by running:
python load_env.py -e UrbanCityMulti
python load_env.py -e Textures
sudo chmod -R 777 ./   # solve permission problems
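
Optionally, smoke-test the environment setup (a minimal sketch assuming the classic Gym API that gym-unrealcv follows; the env id is the one used by the evaluation script below and needs the UrbanCityMulti binary downloaded above):

python -c "
import gym, gym_unrealcv  # importing gym_unrealcv registers the UnrealCV env ids
env = gym.make('UnrealTrackGeneral-UrbanCity-ContinuousColor-v0')
obs = env.reset()  # launches the Unreal binary
obs, reward, done, info = env.step(env.action_space.sample())  # one random step
env.close()
print('gym-unrealcv OK')
"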

Quick Start

Training

python train_offline.py --buffer_path {Data-Path}

Evaluation

python Eval_tracking_agent.py --env UnrealTrackGeneral-UrbanCity-ContinuousColor-v0 --chunk_size 1 --amp --min_mid_term_frames 5 --max_mid_term_frames 10 --detection_every 20 --prompt person.obstacles 
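
The --chunk_size, --amp, --min_mid_term_frames, --max_mid_term_frames, and --detection_every flags mirror DEVA's demo arguments: they enable mixed-precision inference, bound the length of DEVA's mid-term memory, and set how often the grounded detector is re-run, while --prompt gives the text classes (separated by '.') that Grounding DINO detects, here person and obstacles.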

Citation

@inproceedings{zhong2024empowering,
  title={Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL},
  author={Zhong, Fangwei and Wu, Kui and Ci, Hai and Wang, Churan and Chen, Hao},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  year={2024}
}

References

Thanks to the previous works that we build upon:
DEVA: https://github.com/hkchengrex/Tracking-Anything-with-DEVA
Grounded Segment Anything: https://github.com/IDEA-Research/Grounded-Segment-Anything
Segment Anything: https://github.com/facebookresearch/segment-anything
XMem: https://github.com/hkchengrex/XMem
Title card generated with OpenPano: https://github.com/ppwwyyxx/OpenPano
