+ Feb 2020: We are working on moving code from this folder to scenarios\action_recognition.
+ While this work is ongoing, please visit both locations for implementations and documentation.
This directory contains resources for building video-based action recognition systems.
Action recognition (also known as activity recognition) consists of classifying various actions from a sequence of frames.
We implemented two state-of-the-art approaches: (i) I3D and (ii) R(2+1)D. The implementations include example notebooks, e.g. for scoring webcam footage or fine-tuning on the HMDB-51 dataset.
We recommend using the R(2+1)D model for its competitive accuracy, fast inference speed, and fewer dependencies on other packages. For both approaches, our implementations reproduce the accuracies reported in the respective papers:
Model | Reported in the paper | Our results |
---|---|---|
R(2+1)D-34 RGB | 79.6% | 79.8% |
I3D RGB | 74.8% | 73.7% |
I3D Optical flow | 77.1% | 77.5% |
I3D Two-Stream | 80.7% | 81.2% |
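
As a quick sanity check on the table above, the gap between the reported and reproduced accuracies can be computed directly. The snippet below is only an illustrative sketch: `RESULTS` and `accuracy_deltas` are hypothetical names introduced here, not part of this repository, and the numbers are copied from the table.

```python
# Accuracies (%) copied from the table above: (reported in the paper, our result).
# RESULTS and accuracy_deltas are hypothetical names used only for this illustration.
RESULTS = {
    "R(2+1)D-34 RGB": (79.6, 79.8),
    "I3D RGB": (74.8, 73.7),
    "I3D Optical flow": (77.1, 77.5),
    "I3D Two-Stream": (80.7, 81.2),
}


def accuracy_deltas(results):
    """Return {model: ours - reported} in percentage points, rounded to 0.1."""
    return {name: round(ours - paper, 1) for name, (paper, ours) in results.items()}


deltas = accuracy_deltas(RESULTS)
for name, delta in deltas.items():
    # A positive value means our result is above the reported one.
    print(f"{name:<18} {delta:+.1f} pp")
```

Every configuration lands within about one percentage point of the reported figure; the largest gap is I3D RGB, at 1.1 points below the paper.
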

This directory contains the following subdirectories:

Directory | Description |
---|---|
r2p1d | Scripts for fine-tuning a pre-trained R(2+1)D model on the HMDB-51 dataset |
i3d | Scripts for fine-tuning a pre-trained I3D model on the HMDB-51 dataset |