Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implementation of CUDA-ised transform #602
base: main
Are you sure you want to change the base?
Implementation of CUDA-ised transform #602
Changes from 11 commits
69f2f24
5db4d18
f51fb40
c43d99d
7cfc88f
9b3dafb
9c522f3
aa78bd2
f4e99b0
2d3d6c0
49afcd2
d57e0a0
e08cf04
a5c8d5e
a827e76
5e1c6c9
df6824c
03a12d0
c06cc34
8d9c049
52e4f8f
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vchuravy is there a way to apply a cuFFT plan such that the result is directly written into some pre-allocated array? Instead of the allocation here on the right side and then writing it into the view of an array on the left?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jackleland because at the moment it sounds we would just need an
rfft!
/brfft!
method that calls this line, either written withLinearAlgebra.mul!
for CPUs or with something else for GPUs, so the_fourier_batched!
function is actually the same and we wouldn't need any code duplication here, just a new method forrfft!
,brfft!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function assumes 1-based m, lmax inputs. I uses the
ij2k
function from LowerTriangularMatrices to transform between i, j indices of a matrix (here l, m) to the running indexk
as it's called in that module (here lm).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this change hasn't been pushed yet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what change you're referring to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where it is actually called from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's invoked by the
@cuda
macro, so lines 95 (and 109)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name is bad, I'll change it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this needs to be adapted (+comment maybe!) for 0-based m, lmax
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renamed this to
kernel