We compare the training speed of our models against popular frameworks and official releases. All experiments were run in the following environment:
- 8 NVIDIA Tesla V100 (16G) GPUs
- Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
- Python 3.7
- PaddlePaddle 2.0
- CUDA 10.1
- cuDNN 7.6.3
- NCCL 2.1.15
- GCC 8.2.0
The reported statistic is the average training time, including both data processing and model training time; training speed is measured in instances per second (ips). Note that we skip the first 50 iterations, as they may include device warmup time.
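The measurement procedure above can be sketched as follows. This is a minimal illustration, not PaddleVideo's actual benchmarking code; `run_iter`, the iteration counts, and the batch/GPU parameters are placeholders you would replace with your own training step and settings.

```python
import time

def measure_ips(run_iter, num_iters=200, warmup_iters=50,
                batch_size=16, num_gpus=8):
    """Average instances-per-second over training iterations.

    run_iter: callable executing one training iteration (data loading
    plus forward/backward). Hypothetical hook for illustration.
    The first `warmup_iters` iterations are excluded, mirroring the
    convention of skipping the first 50 iters for device warmup.
    """
    times = []
    for i in range(num_iters):
        start = time.perf_counter()
        run_iter()
        elapsed = time.perf_counter() - start
        if i >= warmup_iters:  # skip warmup iterations
            times.append(elapsed)
    avg_iter_time = sum(times) / len(times)
    # total instances processed per second across all GPUs
    return batch_size * num_gpus / avg_iter_time
```

With `batch size x gpus = 16x8`, each iteration processes 128 instances, so ips is simply 128 divided by the average post-warmup iteration time.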
Here we compare PaddleVideo with other video understanding toolkits under the same data and model settings.
To ensure a fair comparison, all experiments were conducted on the same hardware and with the same dataset. The dataset is generated by the data preparation step, and for each model setting the same data preprocessing is applied so that the feature input is identical.
As shown in the table below, PaddleVideo achieves significant speedups over the other video understanding frameworks; in particular, the SlowFast model is nearly 2x faster than its counterparts.
| Model | batch size x gpus | PaddleVideo (ips) | Reference (ips) | MMAction2 (ips) | PySlowFast (ips) |
| --- | --- | --- | --- | --- | --- |
| TSM | 16x8 | 58.1 | 46.04 (temporal-shift-module) | To do | X |
| PPTSM | 16x8 | 57.6 | X | X | X |
| TSN | 16x8 | 841.1 | To do (tsn-pytorch) | To do | X |
| SlowFast | 16x8 | 99.5 | X | To do | 43.2 |
| Attention_LSTM | 128x8 | 112.6 | X | X | X |
| Model | PaddleVideo (ips) | MMAction2 (ips) | BMN (boundary matching network) (ips) |
| --- | --- | --- | --- |
| BMN | 43.84 | X | X |