loop tilling/blocking in SCA forward transformation #19

Open
clementval opened this issue Jun 21, 2018 · 7 comments

Comments

@clementval
Contributor

clementval commented Jun 21, 2018

Support for loop-tiling/blocking

@clementval
Contributor Author

@mlange05 Do you have an example of how you would like this to be?

clementval modified the milestones: Specification 1.0, v2.0 (Sep 28, 2018)

@mlange05

Ok, this is just a preliminary sketch, but basically what I would like for the SCA case is to add an option to insert a single level of blocking around the root call. Something like:

!$claw parallelize forward blocked="bsize"
DO p = 1, nproma
  CALL compute_column(nz, q(p,:), t(p,:), z(p,:))
END DO

which then transforms into something like:

p_blocks = <number of blocks as a function of "bsize">
DO p_i = 1, p_blocks
   p0 = <first index of block in global array>
   p1 = <final index of block in global array>
   ! p0:p1 is an inclusive range, so the block holds p1-p0+1 columns
   CALL compute_column(nz, q(p0:p1,:), t(p0:p1,:), z(p0:p1,:), nproma=p1-p0+1)
END DO

The details of how to define the bsize variable in the pragma or the derivation of block sizes/indices are still a bit fuzzy (sorry), but I'll provide a more concrete example soon...
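One possible way to derive the block count and block bounds, offered only as a sketch: it assumes a fixed block size with a shorter final block, which is one reading of the comment above and not necessarily what CLAW generates. The module name `blocking_sketch` and the helpers `num_blocks` and `block_bounds` are hypothetical names introduced for illustration.

```fortran
module blocking_sketch
  implicit none
contains

  ! Number of blocks needed to cover nproma columns with blocks of
  ! at most bsize columns (integer ceiling division).
  pure integer function num_blocks(nproma, bsize) result(nb)
    integer, intent(in) :: nproma, bsize
    nb = (nproma + bsize - 1) / bsize
  end function num_blocks

  ! Inclusive global index range [p0, p1] of block ib (1-based).
  ! The last block is truncated when bsize does not divide nproma.
  pure subroutine block_bounds(nproma, bsize, ib, p0, p1)
    integer, intent(in)  :: nproma, bsize, ib
    integer, intent(out) :: p0, p1
    p0 = (ib - 1) * bsize + 1
    p1 = min(ib * bsize, nproma)
  end subroutine block_bounds

end module blocking_sketch
```

With `nproma = 10` and `bsize = 4`, this gives three blocks covering columns 1 to 4, 5 to 8, and 9 to 10, so the size passed to the kernel is `p1 - p0 + 1` in each case.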

One key detail to note is that the OpenMP-parallel loop would then live on the outer (block) loop in this calling routine, to avoid false sharing, and would need to be removed from all kernels under this root call for the C backend.
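A hand-written sketch of what this could look like, with the OpenMP directive on the block loop and none inside the kernel. Everything here is an assumption for illustration: the module `blocked_driver_sketch`, the driver `run_blocked`, the kernel body of `compute_column` (a simple per-column running sum), and the ceiling-division block bounds. It is not output from CLAW.

```fortran
module blocked_driver_sketch
  implicit none
contains

  ! Hypothetical stand-in kernel: works on a block of nproma columns
  ! and contains no OpenMP directives of its own.
  subroutine compute_column(nz, q, t, z, nproma)
    integer, intent(in)  :: nz, nproma
    real,    intent(in)  :: q(:, :), t(:, :)
    real,    intent(out) :: z(:, :)
    integer :: p, k
    do p = 1, nproma
      z(p, 1) = q(p, 1)
      do k = 2, nz
        z(p, k) = z(p, k - 1) + q(p, k) * t(p, k)
      end do
    end do
  end subroutine compute_column

  ! Calling routine: the parallel loop runs over blocks, so each
  ! thread owns a disjoint range p0:p1 of the horizontal dimension.
  subroutine run_blocked(nproma, nz, bsize, q, t, z)
    integer, intent(in)  :: nproma, nz, bsize
    real,    intent(in)  :: q(nproma, nz), t(nproma, nz)
    real,    intent(out) :: z(nproma, nz)
    integer :: ib, nblocks, p0, p1

    nblocks = (nproma + bsize - 1) / bsize

    !$omp parallel do private(p0, p1)
    do ib = 1, nblocks
      p0 = (ib - 1) * bsize + 1
      p1 = min(ib * bsize, nproma)
      call compute_column(nz, q(p0:p1, :), t(p0:p1, :), z(p0:p1, :), &
                          nproma=p1 - p0 + 1)
    end do
    !$omp end parallel do
  end subroutine run_blocked

end module blocked_driver_sketch
```

Compiled without OpenMP support, the directives are treated as comments and the loop runs serially. Because each column is computed independently, the blocked result should match a single unblocked call to the kernel.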

@clementval
Contributor Author

@mlange05 Thanks for the input. I will try to draft something in the document and we can iterate on it.

clementval changed the title from "loop tilling/blocking" to "loop tilling/blocking in SCA forward transformation" (Oct 2, 2018)
clementval added the SCA label and removed the LOW-LEVEL label (Oct 2, 2018)

@clementval
Contributor Author

@mlange05 Is it fine if this notion is tied to the Single Column Abstraction, or do you see a use case where you would use this loop tiling/blocking as a low-level transformation, like loop-fusion?

@mlange05

mlange05 commented Oct 2, 2018

Hmmm, good question. My primary use case is the root loop of SCA, but I can see how making it a general loop transformation makes more sense conceptually. Is there a way to combine/compose the two, e.g. the SCA transformation triggering the loop transformation automatically if given a particular keyword?

That being said, if it's easier to do as part of SCA I would be very happy already.

@clementval
Contributor Author

This would be possible. In the loop-interchange directive, for example, the fusion clause triggers a fusion transformation automatically.

@clementval
Contributor Author

PR #24
