Loop tiling/blocking in SCA forward transformation #19
Comments
@mlange05 Do you have an example of how you would like this to look?
Ok, this is just a preliminary sketch, but basically what I would like for the SCA case is to add an option to insert a single level of blocking around the root call. Something like:
then transforms into something like
The details of how to define the […] One key detail to note, then, is that the OpenMP-parallel loop would live on the outer (block) loop in this calling routine, to avoid false sharing, and would need to be removed from all kernels under this root call for the C backend.
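The code snippets from the comment above did not survive, so here is a minimal C sketch of the idea as described: a single level of blocking around the root call, with the OpenMP directive on the outer block loop only. All names (`compute_columns`, `driver_blocked`, `NLEV`, `block_size`) and the array layout are hypothetical and chosen for illustration; this is not the directive syntax or the generated code discussed in the issue.

```c
#include <assert.h>

#define NLEV 4 /* hypothetical number of vertical levels per column */

/* Hypothetical kernel under the root call. It processes the columns
 * [start, end) and carries no OpenMP directive of its own. */
void compute_columns(int start, int end, int ncol, double *field)
{
    for (int k = 0; k < NLEV; ++k) {
        for (int j = start; j < end; ++j) {
            field[k * ncol + j] += 1.0; /* column index varies fastest */
        }
    }
}

/* Calling routine after the (assumed) transformation: one level of
 * blocking around the root call, parallelised over blocks. */
void driver_blocked(int ncol, int block_size, double *field)
{
    assert(block_size > 0);

    /* The only OpenMP directive lives on the outer block loop. */
    #pragma omp parallel for schedule(static)
    for (int jb = 0; jb < ncol; jb += block_size) {
        /* Clamp the last block when ncol is not a multiple of block_size. */
        int end = (jb + block_size < ncol) ? jb + block_size : ncol;
        compute_columns(jb, end, ncol, field);
    }
}
```

Calling `driver_blocked(10, 4, field)` would process the column ranges [0, 4), [4, 8) and [8, 10), each handled by a single thread. Without OpenMP enabled at compile time the pragma is ignored and the blocks simply run in sequence.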
@mlange05 Thanks for the input. I will try to draft something in the document and we can iterate on it.
@mlange05 Is it fine if this notion is tied to the Single Column Abstraction, or do you see a use case where you would use this loop tiling/blocking as a low-level transformation like a […]
Hmmm, good question. My primary use case is the root loop of SCA, but I can see how making it a general loop transformation makes more sense conceptually. Is there a way to combine/compose the two, e.g. the SCA transformation triggering the loop transformation automatically if given a particular keyword? That being said, if it's easier to do as part of SCA I would be very happy already.
This would be possible. In the […]
PR #24
Support for loop-tiling/blocking