RaNet: Relation-aware Video Reading Comprehension for Temporal Language Grounding

Introduction

This repository contains the implementation of our EMNLP 2021 paper, Relation-aware Video Reading Comprehension for Temporal Language Grounding (arxiv paper).

Note:

Our pre-trained models are available at SJTU jbox, baiduyun (passcode: xmc0), or Google Drive.

Installation

Clone the repository and move into its folder:

git clone https://github.com/Huntersxsx/RaNet.git

cd RaNet

To use this source code, you need Python 3.7+ and the following Python 3 packages:

pytorch 1.1.0
torchvision 0.3.0
torchtext
easydict
terminaltables
tqdm
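
As a quick sanity check after installation, the following sketch tries to import each package and prints its version. It is not part of the repository. The import names are assumptions based on the list above (for example, pytorch is imported as `torch`), and `check_modules` is a hypothetical helper name.

```python
import importlib

# Import names assumed to correspond to the packages listed above
# ("pytorch" is imported as "torch").
REQUIRED_MODULES = ["torch", "torchvision", "torchtext",
                    "easydict", "terminaltables", "tqdm"]


def check_modules(names):
    """Map each module name to its version string, "unknown", or None if the import fails."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:
            # Covers missing packages as well as broken binary installs.
            report[name] = None
            continue
        report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    for name, version in check_modules(REQUIRED_MODULES).items():
        status = "MISSING" if version is None else version
        print(f"{name:15s} {status}")
```

Compare the printed versions for `torch` and `torchvision` against the pinned versions above (1.1.0 and 0.3.0).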

Data

We use the data offered by 2D-TAN, and the extracted features can be found at Box.

The folder structure should be as follows:

.
├── checkpoints
│   ├── best
│   │    ├── TACoS
│   │    ├── ActivityNet
│   │    └── Charades
├── data
│   ├── TACoS
│   │    ├── tall_c3d_features.hdf5
│   │    └── ...
│   ├── ActivityNet
│   │    ├── sub_activitynet_v1-3.c3d.hdf5
│   │    └── ...
│   ├── Charades-STA
│   │    ├── charades_vgg_rgb.hdf5
│   │    └── ...
│
├── experiments
│
├── lib
│   ├── core
│   ├── datasets
│   └── models
│
└── moment_localization
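
To confirm that a local checkout matches this layout before training or testing, a small check like the one below may help. It is a sketch and not part of the repository: `missing_paths` is a hypothetical helper, and only the entries named explicitly in the tree are checked (files hidden behind `...` are skipped).

```python
from pathlib import Path

# Paths taken from the folder structure above; entries hidden behind "..."
# in the tree are not checked.
EXPECTED_PATHS = [
    "checkpoints/best/TACoS",
    "checkpoints/best/ActivityNet",
    "checkpoints/best/Charades",
    "data/TACoS/tall_c3d_features.hdf5",
    "data/ActivityNet/sub_activitynet_v1-3.c3d.hdf5",
    "data/Charades-STA/charades_vgg_rgb.hdf5",
    "experiments",
    "lib/core",
    "lib/datasets",
    "lib/models",
    "moment_localization",
]


def missing_paths(root, expected=EXPECTED_PATHS):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # Run from the repository root.
    missing = missing_paths(".")
    if missing:
        print("Missing entries:")
        for rel in missing:
            print("  " + rel)
    else:
        print("Folder structure looks complete.")
```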

Train and Test

Please download the visual features from the Box drive and save them to the data/ folder.

Training

Use the following commands for training:

For TACoS dataset, run:

    sh run_tacos.sh

For ActivityNet-Captions dataset, run:

    sh run_activitynet.sh

For Charades-STA dataset, run:

    sh run_charades.sh

Testing

Our trained models are provided at SJTU jbox, baiduyun (passcode: xmc0), or Google Drive. Please download them to the checkpoints/best/ folder. Use the following commands for testing:

For TACoS dataset, run:

    sh test_tacos.sh

For ActivityNet-Captions dataset, run:

    sh test_activitynet.sh

For Charades-STA dataset, run:

    sh test_charades.sh
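
If you prefer to launch these scripts from Python (for example, to loop over datasets), the sketch below maps a mode and dataset name to the matching script. The script names come from the commands in this section. The dictionary keys and the `build_command` and `run` helpers are hypothetical and not part of the repository.

```python
import subprocess

# Script names as given in the Training and Testing commands of this README.
SCRIPTS = {
    "train": {
        "tacos": "run_tacos.sh",
        "activitynet": "run_activitynet.sh",
        "charades": "run_charades.sh",
    },
    "test": {
        "tacos": "test_tacos.sh",
        "activitynet": "test_activitynet.sh",
        "charades": "test_charades.sh",
    },
}


def build_command(mode, dataset):
    """Return the shell command (as an argument list) for a mode and dataset."""
    try:
        script = SCRIPTS[mode][dataset]
    except KeyError:
        raise ValueError(f"unknown mode/dataset: {mode!r}, {dataset!r}") from None
    return ["sh", script]


def run(mode, dataset):
    """Run the script from the repository root; raise if it exits with an error."""
    subprocess.run(build_command(mode, dataset), check=True)
```

For example, `run("test", "tacos")` is equivalent to `sh test_tacos.sh` when called from the repository root.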

Main results:

TACoS	Rank1@0.3	Rank1@0.5	Rank5@0.3	Rank5@0.5
RaNet	43.34	33.54	67.33	55.09

ActivityNet	Rank1@0.5	Rank1@0.7	Rank5@0.5	Rank5@0.7
RaNet	45.59	28.67	75.93	62.97

Charades (VGG)	Rank1@0.5	Rank1@0.7	Rank5@0.5	Rank5@0.7
RaNet	43.87	26.83	86.67	54.22

Charades (I3D)	Rank1@0.5	Rank1@0.7	Rank5@0.5	Rank5@0.7
RaNet	60.40	39.65	89.57	64.54
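
The tables above report recall-style Rank n@m scores. As a sketch of how such a score is commonly computed in temporal grounding (the percentage of queries for which at least one of the top-n predicted moments has a temporal IoU of at least m with the ground truth), consider the code below. It is an illustration under that assumed definition, not the evaluation code of this repository, and the function names are hypothetical.

```python
def temporal_iou(pred, gt):
    """IoU between two (start, end) segments given in seconds."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    # The hull length equals the union when the segments overlap; when they
    # do not, the intersection is 0 and the IoU is 0 either way.
    union = max(pred[1], gt[1]) - min(pred[0], gt[0])
    return inter / union if union > 0 else 0.0


def rank_n_at_m(predictions, ground_truths, n, m):
    """Percentage of queries whose top-n predictions include a segment with IoU >= m.

    predictions: one list of (start, end) segments per query, sorted by confidence.
    ground_truths: one (start, end) segment per query.
    """
    hits = 0
    for ranked, gt in zip(predictions, ground_truths):
        if any(temporal_iou(p, gt) >= m for p in ranked[:n]):
            hits += 1
    return 100.0 * hits / len(ground_truths)


if __name__ == "__main__":
    # Two toy queries with two ranked predictions each (made-up numbers).
    preds = [[(0, 10), (2, 8)], [(5, 9), (0, 4)]]
    gts = [(2, 8), (0, 4)]
    print(rank_n_at_m(preds, gts, n=1, m=0.5))  # 50.0
    print(rank_n_at_m(preds, gts, n=1, m=0.7))  # 0.0
    print(rank_n_at_m(preds, gts, n=2, m=0.7))  # 100.0
```

In the toy data, the top prediction for the first query has an IoU of 0.6 with its ground truth, so it counts as a hit at m = 0.5 but not at m = 0.7.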

Acknowledgement

We greatly appreciate the 2D-TAN, G-TAD, and CCNet repositories. Please remember to cite the papers:


@inproceedings{gao2021relation,
  title={Relation-aware Video Reading Comprehension for Temporal Language Grounding},
  author={Gao, Jialin and Sun, Xin and Xu, Mengmeng and Zhou, Xi and Ghanem, Bernard},
  booktitle={Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing},
  pages={3978--3988},
  year={2021}
}

@InProceedings{2DTAN_2020_AAAI,
author = {Zhang, Songyang and Peng, Houwen and Fu, Jianlong and Luo, Jiebo},
title = {Learning 2D Temporal Adjacent Networks for Moment Localization with Natural Language},
booktitle = {AAAI},
year = {2020}
} 

@InProceedings{Xu_2020_CVPR,
author = {Xu, Mengmeng and Zhao, Chen and Rojas, David S. and Thabet, Ali and Ghanem, Bernard},
title = {G-TAD: Sub-Graph Localization for Temporal Action Detection},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
url={https://openaccess.thecvf.com/content_CVPR_2020/papers/Xu_G-TAD_Sub-Graph_Localization_for_Temporal_Action_Detection_CVPR_2020_paper.pdf},
month = {June},
year = {2020}
}

@INPROCEEDINGS{9009011,
author={Huang, Zilong and Wang, Xinggang and Huang, Lichao and Huang, Chang and Wei, Yunchao and Liu, Wenyu},
booktitle={2019 IEEE/CVF International Conference on Computer Vision (ICCV)},
title={CCNet: Criss-Cross Attention for Semantic Segmentation},
year={2019},
pages={603--612},
doi={10.1109/ICCV.2019.00069}
}
