LSTM: Move Wx matrix multiplication out of the loop in forward #187

Open
wants to merge 1 commit into
base: master

Conversation

antihutka
Contributor

Move the Wx addmm call out of the timestep loop in forward and compute it in a single call across all timesteps. This should give a significant speedup when running with a small batch_size.
I measured a 10-20% speedup with batch_size=8 on CPU, but I am unable to test on GPU at the moment.
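
For readers unfamiliar with the idea, here is a minimal sketch in plain Python (not the code in this PR) of why the input projection can be hoisted. The term x_t * Wx + b does not depend on the previous hidden state, so it can be computed for all timesteps in one matrix product. The recurrent h * Wh term has to stay inside the loop. All function names, and the N x T x D input layout, are assumptions made for illustration.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def add_bias(m, bias):
    """Add a bias vector to every row of a matrix."""
    return [[v + c for v, c in zip(row, bias)] for row in m]


def input_proj_in_loop(x, wx, b):
    """Baseline: one (N x D) @ (D x G) product per timestep.

    x is N x T x D (batch, time, features); the result is T x N x G.
    """
    n, t = len(x), len(x[0])
    out = []
    for step in range(t):
        x_t = [x[i][step] for i in range(n)]  # N x D slice at this timestep
        out.append(add_bias(matmul(x_t, wx), b))
    return out


def input_proj_hoisted(x, wx, b):
    """Hoisted: flatten to (N*T x D), multiply once, then slice per timestep."""
    n, t = len(x), len(x[0])
    # Row i * t + step of `flat` holds the features of sample i at that step.
    flat = [x[i][step] for i in range(n) for step in range(t)]
    proj = add_bias(matmul(flat, wx), b)  # single large product
    # Regroup into T x N x G so a timestep loop can index proj[step].
    return [[proj[i * t + step] for i in range(n)] for step in range(t)]
```

Both functions return the same values. The hoisted version replaces T small products with one large one, which is where a saving would be expected when batch_size is small and per-call overhead dominates.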

@dgcrouse

I can test GPU execution on CUDA this weekend. Can someone check OpenCL?

2 participants