How to see the speed up on GPU? #27

liaocs2008 · 2018-04-12T17:25:22Z

Issue summary

I ran examples/caffenet_classifier.py but did not see a speedup on my GPU (GTX 1080). My results are below:
python examples/caffenet_classifier.py models/bvlc_reference_caffenet/deploy.prototxt caffenet_SSL_0.4469.caffemodel
I0412 13:00:27.347717 29124 base_conv_layer.cpp:855] conv1 group 0: 64.352 us (Dense Scheme Timing)
I0412 13:00:27.347939 29124 base_conv_layer.cpp:855] conv2 group 0: 48.128 us (Dense Scheme Timing)
I0412 13:00:27.348006 29124 base_conv_layer.cpp:855] conv2 group 1: 50.176 us (Dense Scheme Timing)
I0412 13:00:27.348263 29124 base_conv_layer.cpp:855] conv3 group 0: 91.136 us (Dense Scheme Timing)
I0412 13:00:27.348376 29124 base_conv_layer.cpp:855] conv4 group 0: 34.816 us (Dense Scheme Timing)
I0412 13:00:27.348429 29124 base_conv_layer.cpp:855] conv4 group 1: 37.152 us (Dense Scheme Timing)
I0412 13:00:27.348539 29124 base_conv_layer.cpp:855] conv5 group 0: 32.768 us (Dense Scheme Timing)
I0412 13:00:27.348592 29124 base_conv_layer.cpp:855] conv5 group 1: 36.832 us (Dense Scheme Timing)

python examples/caffenet_classifier.py models/bvlc_reference_caffenet/deploy_csrmm.prototxt caffenet_SSL_0.4469.caffemodel
I0411 20:03:28.505044 14113 base_conv_layer.cpp:813] conv1 group 0: 479.008 us (Compressed Row Storage Timing)
I0411 20:03:28.505533 14113 base_conv_layer.cpp:813] conv2 group 0: 286.528 us (Compressed Row Storage Timing)
I0411 20:03:28.505679 14113 base_conv_layer.cpp:813] conv2 group 1: 124.928 us (Compressed Row Storage Timing)
I0411 20:03:28.505998 14113 base_conv_layer.cpp:813] conv3 group 0: 141.312 us (Compressed Row Storage Timing)
I0411 20:03:28.506122 14113 base_conv_layer.cpp:813] conv4 group 0: 38.016 us (Compressed Row Storage Timing)
I0411 20:03:28.506234 14113 base_conv_layer.cpp:813] conv4 group 1: 94.144 us (Compressed Row Storage Timing)
I0411 20:03:28.506362 14113 base_conv_layer.cpp:813] conv5 group 0: 45.152 us (Compressed Row Storage Timing)
I0411 20:03:28.506521 14113 base_conv_layer.cpp:813] conv5 group 1: 140.288 us (Compressed Row Storage Timing)
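The two logs above can be compared layer by layer. The following is a minimal sketch, assuming the log line format shown in this post; the helper names `parse_timings` and `speedups` are made up for illustration and are not part of the repository. Only the first two lines of each log are included to keep it short.

```python
import re

# Matches lines such as:
#   "... base_conv_layer.cpp:855] conv1 group 0: 64.352 us (Dense Scheme Timing)"
TIMING_RE = re.compile(r"\]\s+(\w+) group (\d+): ([\d.]+) us \((.+?) Timing\)")


def parse_timings(log_text):
    """Return {(layer, group): microseconds} for every timing line found."""
    timings = {}
    for match in TIMING_RE.finditer(log_text):
        layer, group, micros, _scheme = match.groups()
        timings[(layer, int(group))] = float(micros)
    return timings


def speedups(dense, sparse):
    """Dense time divided by sparse time; values below 1.0 mean a slowdown."""
    return {key: dense[key] / sparse[key] for key in dense if key in sparse}


dense_log = """
I0412 13:00:27.347717 29124 base_conv_layer.cpp:855] conv1 group 0: 64.352 us (Dense Scheme Timing)
I0412 13:00:27.347939 29124 base_conv_layer.cpp:855] conv2 group 0: 48.128 us (Dense Scheme Timing)
"""
sparse_log = """
I0411 20:03:28.505044 14113 base_conv_layer.cpp:813] conv1 group 0: 479.008 us (Compressed Row Storage Timing)
I0411 20:03:28.505533 14113 base_conv_layer.cpp:813] conv2 group 0: 286.528 us (Compressed Row Storage Timing)
"""

ratios = speedups(parse_timings(dense_log), parse_timings(sparse_log))
for (layer, group), ratio in sorted(ratios.items()):
    print("%s group %d: %.2fx" % (layer, group, ratio))
```

Every ratio here is below 1.0, which matches the observation that the Compressed Row Storage runs are slower than the dense runs on this GPU.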

Your system configuration

Operating system: Ubuntu
Compiler: GCC
CUDA version (if applicable): 8.0
CUDNN version (if applicable): 5.1
BLAS: cuBlas
Python or MATLAB version (for pycaffe and matcaffe respectively): 2.7

wenwei202 · 2018-04-13T17:58:28Z

Randomly (non-structured) sparse networks computed with CSR (compressed row storage) sparse kernels are slow. Please use structured sparsity and conv_mode: LOWERED_CCNMM in the deploy prototxt. More details are in the tutorials.
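To confirm which conv_mode a deploy file actually requests, the file can be scanned as plain text. This is a rough sketch only: the `conv_modes` helper is hypothetical, and the example fragment (including the placement of conv_mode inside convolution_param) is an assumed layout for illustration, not copied from the repository.

```python
import re


def conv_modes(prototxt_text):
    """List every conv_mode value that appears in the prototxt text, in order."""
    return re.findall(r"\bconv_mode\s*:\s*(\w+)", prototxt_text)


# Hypothetical fragment; the surrounding field layout is assumed for illustration.
example = """
layer {
  name: "conv1"
  type: "Convolution"
  convolution_param {
    num_output: 96
    conv_mode: LOWERED_CCNMM
  }
}
"""

print(conv_modes(example))
```

A text scan like this avoids needing the compiled protobuf definitions, at the cost of not validating the file.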

liaocs2008 · 2018-04-13T21:02:27Z

Thanks for your reply. The model in this case is "caffenet_SSL_0.4469.caffemodel", which should have been trained with structured sparsity.

I can see a CPU speedup (with caffe.set_mode_cpu()) using "conv_mode: LOWERED_CCNMM", but I cannot see a speedup on the GPU (with caffe.set_mode_gpu()) when using "conv_mode: LOWERED_CSRMM".

Could you help me get the speedup on the GPU? Thanks in advance.

jiaqun123 · 2018-05-14T13:07:56Z

@wenwei202 @liaocs2008
I would like to know whether the command "./build/tools/caffe time -model=xx/xx.prototxt -weights=xx/xx.caffemodel" can be used to measure the test time and compute the speedup.

liaocs2008 · 2018-05-14T14:55:48Z

@jiaqun123 The "caffe time" command measures the end-to-end time of each layer. But I think in this implementation @wenwei202 measures only the time of the matrix multiplication. You can use the Python script "examples/caffenet_classifier.py" to measure the speedup.
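For comparison with the matrix-multiplication timings, the end-to-end forward time can also be measured from Python. This is a minimal sketch, assuming only an object with a forward() method such as a pycaffe Net; `average_forward_ms` is a made-up helper name, and the file paths in the usage comment are placeholders, not files from this thread.

```python
import time


def average_forward_ms(net, iterations=50, warmup=5):
    """Average wall-clock milliseconds per call to net.forward()."""
    for _ in range(warmup):
        net.forward()  # warm-up calls are excluded from the measurement
    start = time.time()
    for _ in range(iterations):
        net.forward()
    return (time.time() - start) * 1000.0 / iterations


# Hypothetical usage with pycaffe (paths are placeholders):
#   import caffe
#   caffe.set_mode_gpu()
#   net = caffe.Net("deploy.prototxt", "weights.caffemodel", caffe.TEST)
#   print(average_forward_ms(net))
```

This number includes everything a forward pass does (data layout changes, memory copies, non-convolution layers), so it can differ a lot from the sum of the per-layer matrix-multiplication times printed in the logs.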

jiaqun123 · 2018-05-15T01:35:39Z

@liaocs2008
Thanks for your reply. I tried examples/cifar10_classifier.py:
python examples/cifar10_classifier.py models/cifar10/cifar10_full.prototxt cifar10_full_iter_200000.caffemodel
But I got the following result, which does not include the time for each conv layer.
