low rank deep neural networks #8

Open
wenwei202 opened this issue May 13, 2017 · 7 comments
Comments

@wenwei202
Owner

wenwei202 commented May 13, 2017

Issue summary

I am working on low-rank deep neural networks to speed up testing (inference) for better deployability. Is anyone working on similar stuff?

Steps to reproduce

Code is in https://github.com/wenwei202/caffe/tree/sfm.
Related publication in ICCV 2017

@bachml

bachml commented May 31, 2017

Yes.
But I didn't find anything new in the sfm branch. Would you mind providing some documentation for this branch? Thanks.

@wenwei202
Owner Author

@bachml We have a simple tutorial on the usage of this code. We will add more details soon.

@bachml

bachml commented Jun 14, 2017

@wenwei202 Thanks for your remarkable work.
However, a problem came up: no speedup was observed when a convolution layer with rank M = 1 (a higher layer in ResNet) was decomposed. Also, I didn't find any experiments on ResNet speedups from force regularization.
Did you encounter this issue?

@wenwei202
Owner Author

@bachml I did not measure speedup on ResNet. Decomposing to rank 1 should provide some benefit. Is it an issue with the implementation?

@bachml

bachml commented Jun 15, 2017

@wenwei202 More testing on my baseline (a 27-layer ResNet) shows that the issue is related to multi-threaded BLAS performance (Caffe on CPU).
With an 8-thread OpenBLAS backend it achieves only a 1.20x speedup, which is not significant; with single-threaded OpenBLAS it reaches 2.00x. I guess this is because a rank-1 convolution corresponds to a small matrix multiplication under the im2col trick, which does not benefit much from multi-threading.
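To make the size argument concrete, here is a minimal NumPy sketch with hypothetical layer dimensions (assumptions, not measurements from the 27-layer ResNet) of the GEMM shapes before and after a rank-1 decomposition; both rank-1 GEMMs have a dimension of size 1, so there is little work to split across threads and the BLAS threading overhead can eat the speedup.

```python
import numpy as np

# Hypothetical layer sizes (assumptions for illustration only)
C, N, k = 64, 64, 3          # input channels, output channels, kernel size
H, W = 28, 28                # output feature map size
cols = np.random.randn(C * k * k, H * W)   # im2col buffer: (C*k*k) x (H*W)

# Original convolution as a single GEMM: (N x C*k*k) @ (C*k*k x H*W)
W_full = np.random.randn(N, C * k * k)
out_full = W_full @ cols                   # N x (H*W)

# After rank-1 decomposition: one basis filter, then a 1x1 combination layer
w_basis = np.random.randn(1, C * k * k)    # single filter
w_comb = np.random.randn(N, 1)             # 1x1 conv = linear combination
out_lr = w_comb @ (w_basis @ cols)         # (1 x H*W) then (N x H*W)

print("full GEMM:   ", W_full.shape, "x", cols.shape)
print("rank-1 GEMMs:", w_basis.shape, "x", cols.shape,
      "then", w_comb.shape, "x", (w_basis @ cols).shape)
# The rank-1 GEMMs are extremely "skinny", so per-thread work is tiny and
# multi-threaded BLAS shows little gain over a single thread.
```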

@wenwei202
Owner Author

@bachml In the rank-one case, the conv layer is decomposed into a conv layer with only one filter plus a linear combination layer, which is essentially a conv layer with 1x1 kernels. Some code optimization may be required to fully exploit this kind of compactness.
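For illustration, a minimal NumPy sketch of that decomposition (an assumption-laden illustration, not the actual code in the sfm branch): the trained filters are flattened, a truncated SVD gives M basis filters for the first conv layer, and the scaled left singular vectors become the 1x1 linear-combination layer; M = 1 is the rank-one case discussed above.

```python
import numpy as np

def decompose_conv(weights, M):
    """Split an (N, C, k, k) conv weight into M basis filters plus a 1x1
    linear-combination layer via truncated SVD (rank-M approximation)."""
    N, C, k, _ = weights.shape
    W2d = weights.reshape(N, C * k * k)                 # one row per filter
    U, S, Vt = np.linalg.svd(W2d, full_matrices=False)
    basis = Vt[:M].reshape(M, C, k, k)                  # first conv: M filters
    combine = (U[:, :M] * S[:M]).reshape(N, M, 1, 1)    # second conv: 1x1 kernels
    return basis, combine

# Usage: replace the original layer with conv(basis) followed by conv1x1(combine).
W = np.random.randn(64, 64, 3, 3)                       # hypothetical trained weights
basis, combine = decompose_conv(W, M=1)                 # M = 1: the rank-one case
print(basis.shape, combine.shape)                       # (1, 64, 3, 3) (64, 1, 1, 1)
```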

@wenwei202
Owner Author

In case you are still interested in this research topic, the details are covered in the paper, which was just accepted to ICCV 2017.
