Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unify more CUDA and HIP files into common #1516

Merged
merged 17 commits into from
Jul 5, 2024
Merged

Unify more CUDA and HIP files into common #1516

merged 17 commits into from
Jul 5, 2024

Conversation

upsj
Copy link
Member

@upsj upsj commented Dec 19, 2023

A lot of the files we have in common/cuda_hip don't work standalone, but instead require to be included with certain other includes being available. Here I'm trying to change this, allowing the files to be parsed from tooling more easily, and making it easier to write CUDA and HIP code simultaneously.

I did many of these modifications automatically using the unify_cuda_hip.py python script, or using plain text replacement to fix the headers

Some related questions:

  • What should be the canonical place of including hip_runtime.h? I don't think we should have to do that with every file, instead something like config.hip.hpp or types.hip.hpp seems most appropriate
  • Should hip::config be used on host or device or both? If we use it on the device only, we should be able to handle ROCm gfx10 and gfx11 not supported #1429 more easily, the host side should instead be handled with a runtime value, like exec->get_warp_size()

@upsj upsj added 1:ST:WIP This PR is a work in progress. Not ready for review. 1:ST:need-feedback The PR is somewhat ready but feedback on a blocking topic is required before a proper review. labels Dec 19, 2023
@upsj upsj requested a review from a team December 19, 2023 20:27
@upsj upsj self-assigned this Dec 19, 2023
@ginkgo-bot ginkgo-bot added reg:build This is related to the build system. reg:testing This is related to testing. mod:core This is related to the core module. mod:cuda This is related to the CUDA module. reg:benchmarking This is related to benchmarking. type:solver This is related to the solvers type:preconditioner This is related to the preconditioners type:matrix-format This is related to the Matrix formats mod:hip This is related to the HIP module. type:factorization This is related to the Factorizations type:reordering This is related to the matrix(LinOp) reordering reg:helper-scripts This issue/PR is related to the helper scripts mainly concerned with development of Ginkgo. type:multigrid This is related to multigrid type:stopping-criteria This is related to the stopping criteria labels Dec 19, 2023
@upsj upsj removed the 1:ST:WIP This PR is a work in progress. Not ready for review. label Dec 20, 2023
@upsj upsj marked this pull request as ready for review December 20, 2023 13:32
Copy link

Quality Gate Passed Quality Gate passed

Kudos, no new issues were introduced!

0 New issues
0 Security Hotspots
No data about Coverage
17.8% Duplication on New Code

See analysis details on SonarCloud

@MarcelKoch
Copy link
Member

TBH, I'm not sure if this can be reviewed by humans. 200+ changed files are not easily processed manually. So I guess we have to rely on our CI. If you could provide some key files to look at, that would help a lot.
But in general terms, unifying redundant code sounds good to me. Is the goal to unify as many files as possible and only exclude those, where the actual implementation differs? If so, maybe a list of files that are not suitable would be helpful.

@upsj
Copy link
Member Author

upsj commented Jan 25, 2024

@MarcelKoch Most of these changes are 90% automatical, but the remaining 10% is what's interesting. I'll see what I can do in terms of providing a diff what changed beyond the automated stuff. I reviewed a similar PR in the past using meld's three-way comparison.

@MarcelKoch
Copy link
Member

Perhaps the easiest is if you already know where the 10% interesting stuff is, you could just give us a list of files and then we can manage.

@upsj
Copy link
Member Author

upsj commented Jan 25, 2024

I had to do so many small changes, I can't really remember, I'll have to reproduce that in an automated fashion

@MarcelKoch
Copy link
Member

TBH, I'm not sure if the places where you did manual changes are that important. AFAIK the changes are mostly code movement, with the addition of perhaps some wrappers around the moved code. And I'm somewhat confident that our CI would catch errors on that level. If there are any changes besides that, it would be good to know if you can point us to those.
Otherwise, the question of review is rather on the concept itself and not on the individual changes.

@upsj
Copy link
Member Author

upsj commented Jan 25, 2024

Some of the changes handle inconsistencies between CUDA and HIP versions, some of which might have performance implications, so I don't want to just rely on CI here.

@upsj
Copy link
Member Author

upsj commented Jul 3, 2024

@MarcelKoch I agree, that would be my next goal

Copy link
Member

@yhmtsai yhmtsai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

two left in my mind: preconditioner/jacobi_common.hpp and header guard

#endif


#include <thrust/tuple.h>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when it becomes the header, it needs the header guard

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file is not self-sufficient without the unified/kernel_launch.hpp header, just like before, so it does not need one.

common/cuda_hip/components/atomic.hpp Outdated Show resolved Hide resolved
common/cuda_hip/components/reduction.hpp Outdated Show resolved Hide resolved
common/cuda_hip/distributed/index_map_kernels.cpp Outdated Show resolved Hide resolved
cuda/preconditioner/jacobi_advanced_apply_kernels.cu Outdated Show resolved Hide resolved
cuda/preconditioner/jacobi_generate_kernels.cu Outdated Show resolved Hide resolved
cuda/preconditioner/jacobi_generate_kernels.instantiate.cu Outdated Show resolved Hide resolved
cuda/preconditioner/jacobi_simple_apply_kernels.cu Outdated Show resolved Hide resolved
@upsj upsj requested a review from yhmtsai July 4, 2024 14:10
@upsj upsj added 1:ST:no-changelog-entry Skip the wiki check for changelog update 1:ST:ready-to-merge This PR is ready to merge. and removed 1:ST:ready-for-review This PR is ready for review labels Jul 5, 2024
@upsj upsj force-pushed the unify_cuda_hip branch from 87e2b57 to 391a638 Compare July 5, 2024 07:01
@upsj upsj merged commit 6dc9a6e into develop Jul 5, 2024
12 of 14 checks passed
@upsj upsj deleted the unify_cuda_hip branch July 5, 2024 10:34
Copy link

sonarqubecloud bot commented Jul 5, 2024

MarcelKoch pushed a commit to MarcelKoch/ginkgo that referenced this pull request Dec 2, 2024
This moves most of the CUDA and HIP source files into common/cuda_hip

Related PR: ginkgo-project#1516
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1:ST:no-changelog-entry Skip the wiki check for changelog update 1:ST:ready-to-merge This PR is ready to merge. mod:core This is related to the core module. mod:cuda This is related to the CUDA module. mod:hip This is related to the HIP module. reg:benchmarking This is related to benchmarking. reg:build This is related to the build system. reg:helper-scripts This issue/PR is related to the helper scripts mainly concerned with development of Ginkgo. reg:testing This is related to testing. type:factorization This is related to the Factorizations type:matrix-format This is related to the Matrix formats type:multigrid This is related to multigrid type:preconditioner This is related to the preconditioners type:reordering This is related to the matrix(LinOp) reordering type:solver This is related to the solvers type:stopping-criteria This is related to the stopping criteria
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants