How to see the speed up on GPU? #27

Open
liaocs2008 opened this issue Apr 12, 2018 · 5 comments

Comments

@liaocs2008

Please use the caffe-users list for usage, installation, or modeling questions, or other requests for help.
Do not post such requests to Issues. Doing so interferes with the development of Caffe.

Please read the guidelines for contributing before submitting this issue.

Issue summary

I ran examples/caffenet_classifier.py but did not see any speedup on my GPU (GTX 1080). My results are below: the first run uses the dense deploy.prototxt, and the second uses deploy_csrmm.prototxt.
python examples/caffenet_classifier.py models/bvlc_reference_caffenet/deploy.prototxt caffenet_SSL_0.4469.caffemodel
I0412 13:00:27.347717 29124 base_conv_layer.cpp:855] conv1 group 0: 64.352 us (Dense Scheme Timing)
I0412 13:00:27.347939 29124 base_conv_layer.cpp:855] conv2 group 0: 48.128 us (Dense Scheme Timing)
I0412 13:00:27.348006 29124 base_conv_layer.cpp:855] conv2 group 1: 50.176 us (Dense Scheme Timing)
I0412 13:00:27.348263 29124 base_conv_layer.cpp:855] conv3 group 0: 91.136 us (Dense Scheme Timing)
I0412 13:00:27.348376 29124 base_conv_layer.cpp:855] conv4 group 0: 34.816 us (Dense Scheme Timing)
I0412 13:00:27.348429 29124 base_conv_layer.cpp:855] conv4 group 1: 37.152 us (Dense Scheme Timing)
I0412 13:00:27.348539 29124 base_conv_layer.cpp:855] conv5 group 0: 32.768 us (Dense Scheme Timing)
I0412 13:00:27.348592 29124 base_conv_layer.cpp:855] conv5 group 1: 36.832 us (Dense Scheme Timing)

python examples/caffenet_classifier.py models/bvlc_reference_caffenet/deploy_csrmm.prototxt caffenet_SSL_0.4469.caffemodel
I0411 20:03:28.505044 14113 base_conv_layer.cpp:813] conv1 group 0: 479.008 us (Compressed Row Storage Timing)
I0411 20:03:28.505533 14113 base_conv_layer.cpp:813] conv2 group 0: 286.528 us (Compressed Row Storage Timing)
I0411 20:03:28.505679 14113 base_conv_layer.cpp:813] conv2 group 1: 124.928 us (Compressed Row Storage Timing)
I0411 20:03:28.505998 14113 base_conv_layer.cpp:813] conv3 group 0: 141.312 us (Compressed Row Storage Timing)
I0411 20:03:28.506122 14113 base_conv_layer.cpp:813] conv4 group 0: 38.016 us (Compressed Row Storage Timing)
I0411 20:03:28.506234 14113 base_conv_layer.cpp:813] conv4 group 1: 94.144 us (Compressed Row Storage Timing)
I0411 20:03:28.506362 14113 base_conv_layer.cpp:813] conv5 group 0: 45.152 us (Compressed Row Storage Timing)
I0411 20:03:28.506521 14113 base_conv_layer.cpp:813] conv5 group 1: 140.288 us (Compressed Row Storage Timing)
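
To make the comparison between the two runs easier to read, here is a minimal sketch that parses the timing lines above and prints the CRS-to-dense ratio per layer and group. The regular expression and the helper names (`parse_timings`, `slowdown`) are assumptions made for illustration; they are not part of the repository. The log text is copied from the two runs above.

```python
import re

# Matches e.g. "...cpp:855] conv1 group 0: 64.352 us (Dense Scheme Timing)".
# The pattern is an assumption based only on the log lines in this issue.
LINE_RE = re.compile(r"\]\s+(\S+) group (\d+): ([\d.]+) us \((.+?) Timing\)")


def parse_timings(log_text):
    """Return {(layer, group): microseconds} for every timing line found."""
    timings = {}
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            layer, group, micros, _scheme = match.groups()
            timings[(layer, int(group))] = float(micros)
    return timings


def slowdown(dense, sparse):
    """Ratio sparse/dense per (layer, group); values above 1 mean sparse is slower."""
    return {key: sparse[key] / dense[key] for key in dense if key in sparse}


DENSE_LOG = """\
I0412 13:00:27.347717 29124 base_conv_layer.cpp:855] conv1 group 0: 64.352 us (Dense Scheme Timing)
I0412 13:00:27.347939 29124 base_conv_layer.cpp:855] conv2 group 0: 48.128 us (Dense Scheme Timing)
I0412 13:00:27.348006 29124 base_conv_layer.cpp:855] conv2 group 1: 50.176 us (Dense Scheme Timing)
I0412 13:00:27.348263 29124 base_conv_layer.cpp:855] conv3 group 0: 91.136 us (Dense Scheme Timing)
I0412 13:00:27.348376 29124 base_conv_layer.cpp:855] conv4 group 0: 34.816 us (Dense Scheme Timing)
I0412 13:00:27.348429 29124 base_conv_layer.cpp:855] conv4 group 1: 37.152 us (Dense Scheme Timing)
I0412 13:00:27.348539 29124 base_conv_layer.cpp:855] conv5 group 0: 32.768 us (Dense Scheme Timing)
I0412 13:00:27.348592 29124 base_conv_layer.cpp:855] conv5 group 1: 36.832 us (Dense Scheme Timing)
"""

CRS_LOG = """\
I0411 20:03:28.505044 14113 base_conv_layer.cpp:813] conv1 group 0: 479.008 us (Compressed Row Storage Timing)
I0411 20:03:28.505533 14113 base_conv_layer.cpp:813] conv2 group 0: 286.528 us (Compressed Row Storage Timing)
I0411 20:03:28.505679 14113 base_conv_layer.cpp:813] conv2 group 1: 124.928 us (Compressed Row Storage Timing)
I0411 20:03:28.505998 14113 base_conv_layer.cpp:813] conv3 group 0: 141.312 us (Compressed Row Storage Timing)
I0411 20:03:28.506122 14113 base_conv_layer.cpp:813] conv4 group 0: 38.016 us (Compressed Row Storage Timing)
I0411 20:03:28.506234 14113 base_conv_layer.cpp:813] conv4 group 1: 94.144 us (Compressed Row Storage Timing)
I0411 20:03:28.506362 14113 base_conv_layer.cpp:813] conv5 group 0: 45.152 us (Compressed Row Storage Timing)
I0411 20:03:28.506521 14113 base_conv_layer.cpp:813] conv5 group 1: 140.288 us (Compressed Row Storage Timing)
"""

dense = parse_timings(DENSE_LOG)
crs = parse_timings(CRS_LOG)
ratios = slowdown(dense, crs)

for (layer, group), ratio in sorted(ratios.items()):
    print("{} group {}: CRS/dense = {:.2f}x".format(layer, group, ratio))
print("all conv layers: CRS/dense = {:.2f}x".format(
    sum(crs.values()) / sum(dense.values())))
```

Every ratio comes out above 1, i.e. the CRS run is slower than the dense run for each layer and group in these logs.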

Steps to reproduce

If you are having difficulty building Caffe or training a model, please ask the caffe-users mailing list. If you are reporting a build error that seems to be due to a bug in Caffe, please attach your build configuration (either Makefile.config or CMakeCache.txt) and the output of the make (or cmake) command.

Your system configuration

Operating system: Ubuntu
Compiler: GCC
CUDA version (if applicable): 8.0
CUDNN version (if applicable): 5.1
BLAS: cuBLAS
Python or MATLAB version (for pycaffe and matcaffe respectively): 2.7

@wenwei202
Owner

Randomly sparse neural networks computed with CRS (compressed row storage) sparse kernels are slow. Please use structured sparsity and conv_mode: LOWERED_CCNMM in our deploy prototxt. More details are in the tutorials.

@liaocs2008
Author

Thanks for your reply. The model in this case is "caffenet_SSL_0.4469.caffemodel", which should have been trained with structured sparsity.

I can see a CPU speedup (with caffe.set_mode_cpu()) using "conv_mode: LOWERED_CCNMM". But I can't see a speedup on GPU (with caffe.set_mode_gpu()) when applying "conv_mode: LOWERED_CSRMM".

Could you help me see the speed up on GPU? Thanks in advance.

@jiaqun123

@wenwei202 @liaocs2008
I would like to know whether the command "./build/tools/caffe time -model=xx/xx.prototxt -weights=xx/xx.caffemodel" can be used to get the test time and compute the speedup.

@liaocs2008
Author

@jiaqun123 The "caffe time" command measures the end-to-end time of a layer. But I think that in this implementation, @wenwei202 measures only the time for the matrix multiplication. You can use the Python script "examples/caffenet_classifier.py" to measure the speedup.
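
For comparison with the matrix-multiplication-only numbers, a rough way to get an end-to-end time from pycaffe is sketched below. This is only a sketch under assumptions: `time_forward` and `load_net` are hypothetical helper names, not functions from this repository, and the measurement is plain wall-clock time around `net.forward`, so on GPU it also includes kernel launch overhead and copying the outputs back to the host. It uses only standard pycaffe calls (`caffe.set_mode_cpu`, `caffe.set_mode_gpu`, `caffe.Net`, `net.forward`).

```python
import timeit


def time_forward(forward, warmup=2, iters=10):
    """Average wall-clock seconds per call of `forward`, after warm-up calls.

    Warm-up runs are excluded so one-time costs (memory allocation,
    CUDA context creation) do not distort the average.
    """
    for _ in range(warmup):
        forward()
    start = timeit.default_timer()
    for _ in range(iters):
        forward()
    return (timeit.default_timer() - start) / iters


def load_net(prototxt, weights, use_gpu):
    """Hypothetical helper: build a test-phase net in CPU or GPU mode."""
    import caffe  # imported lazily so time_forward works without pycaffe
    if use_gpu:
        caffe.set_mode_gpu()
    else:
        caffe.set_mode_cpu()
    return caffe.Net(prototxt, weights, caffe.TEST)


# Example usage with the files named in this issue (needs pycaffe and the model):
# net = load_net("models/bvlc_reference_caffenet/deploy.prototxt",
#                "caffenet_SSL_0.4469.caffemodel", use_gpu=True)
# print("whole net: {:.3f} ms".format(1e3 * time_forward(net.forward)))
# After one full forward pass, a single layer can be timed the same way:
# print("conv1: {:.3f} ms".format(
#     1e3 * time_forward(lambda: net.forward(start="conv1", end="conv1"))))
```

Numbers from this kind of measurement are not directly comparable to the per-group microsecond timings logged by base_conv_layer.cpp, since they cover more than the matrix multiplication.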

@jiaqun123

@liaocs2008
Thanks for your reply. I tried to use examples/cifar10_classifier.py:
python examples/cifar10_classifier.py models/cifar10/cifar10_full.prototxt cifar10_full_iter_200000.caffemodel
But I got the following result and didn't get the time for each conv layer.
image
