WIP: Testing some par_dispatch stuff #1156
base: develop
Testing vectorization using this branch in Riot with [...]. The run time difference for a 3D AMR test problem between these two branches is negligible; the old one wins by maybe a couple of percent, but I think the run-to-run variance is larger than that.
PR Summary
This is just an attempt to see if some of the template magic in #1142 could be written in a slightly different way. It basically just copies the ideas there but structures the code differently. It seems to be working both on CPU and on device.
- [...] `IndexRange`s defining the loop bounds. This is enabled by `LoopBoundTranslator`.
- [...] `LoopPatternTPTTR` ([...] which requires at least rank 2)
- [...] `FlatRange` [...]
- [...] `ThreadVectorRange` as an option for `par_for_inner`
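To illustrate the idea of accepting `IndexRange`s as loop bounds, here is a minimal standalone sketch of how a mixed argument list of integer pairs and range objects could be flattened into one array of bounds. It is not the code from this PR or from #1142: the `IndexRange` struct below is a simplified stand-in, and `CountRanges`, `FlatBounds`, `TranslateBounds`, and `ForEach2D` are hypothetical names chosen for the example. It uses plain C++17 on the host and no Kokkos.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>

// Simplified stand-in for an index range with inclusive bounds (assumption).
struct IndexRange {
  int s = 0;  // start (inclusive)
  int e = 0;  // end (inclusive)
};

// Count the loop dimensions described by an argument list: each IndexRange
// is one dimension, and every two integer arguments form one dimension.
template <class... Args>
constexpr std::size_t CountRanges() {
  constexpr std::size_t n_ranges =
      (0 + ... + (std::is_same_v<std::decay_t<Args>, IndexRange> ? 1 : 0));
  constexpr std::size_t n_ints = sizeof...(Args) - n_ranges;
  static_assert(n_ints % 2 == 0, "integer bounds must come in (start, end) pairs");
  return n_ranges + n_ints / 2;
}

// Hypothetical accumulator that appends bounds in argument order.
template <std::size_t N>
struct FlatBounds {
  std::array<int, 2 * N> v{};
  std::size_t n = 0;
  void Push(int x) { v[n++] = x; }
  void Push(const IndexRange &r) {
    v[n++] = r.s;
    v[n++] = r.e;
  }
};

// Flatten a mix of integers and IndexRanges into {s0, e0, s1, e1, ...}.
template <class... Args>
auto TranslateBounds(Args... args) {
  FlatBounds<CountRanges<Args...>()> out;
  (out.Push(args), ...);  // fold expression, evaluated left to right
  return out.v;
}

// Example consumer: a serial 2D loop over the translated inclusive bounds.
template <class F, class... Args>
void ForEach2D(F &&f, Args... args) {
  static_assert(CountRanges<Args...>() == 2, "expected exactly two dimensions");
  const auto b = TranslateBounds(args...);
  for (int j = b[0]; j <= b[1]; ++j) {
    for (int i = b[2]; i <= b[3]; ++i) {
      f(j, i);
    }
  }
}
```

With this sketch, `ForEach2D(f, IndexRange{0, 3}, 1, 2)` and `ForEach2D(f, 0, 3, 1, 2)` would visit the same index pairs, which is the convenience the translation step is meant to provide.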
PR Checklist