Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL][NVPTX] Emit reqd_work_group_size attributes as NVVM annotations #14502

Merged
merged 3 commits into from
Jul 16, 2024

Conversation

frasercrmck
Copy link
Contributor

Only emit the provided values as annotations in the LLVM IR. The NVPTX backend will pad missing values with 1s. This suits the fact that the attribute must provide as many values as the dimensionality of the work-group, and we can assume that the work-group size of unused dimensions is 1.

Only emit the provided values as annotations in the LLVM IR. The NVPTX
backend will pad missing values with 1s. This suits the fact that the
attribute must provide as many values as the dimensionality of the
work-group, and we can assume that the work-group size of unused
dimensions is 1.
Copy link
Contributor

@elizabethandrews elizabethandrews left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks ok to me but this attribute is handled differently for different backends now and I do not know if that is ok (#13600). @premanandrao @steffenlarsen please weigh in here.

@frasercrmck
Copy link
Contributor Author

The code looks ok to me but this attribute is handled differently for different backends now and I do not know if that is ok (#13600). @premanandrao @steffenlarsen please weigh in here.

It's not ideal, no. I do think NVPTX.cpp is a good place to add NVPTX-specific metadata, and CodeGenFunction.cpp is a good place to add target-independent metadata.

We do need this NVPTX-specific lowering as the NVVM annotations are the only way the NVPTX backend is going to make use of the attributes - it's just ignored otherwise. I think the most regretful outcome if that we have the same information repeated in multiple different ways: see also how we are already handling [[intel::max_work_group_size]] attribute this way. For example, intel::max_work_group_size(4, 8, 16) gives us:

define  void @foo() max_work_group_size !28 {
   ...
}

!nvvm.annotations = !{19, !20, !21}

!19 = !{ptr @_ZTSZZ4mainENKUlRN4sycl3_V17handlerEE_clES2_E1C, !"maxntidx", i32 16}
!20 = !{ptr @_ZTSZZ4mainENKUlRN4sycl3_V17handlerEE_clES2_E1C, !"maxntidy", i32 8}
!21 = !{ptr @_ZTSZZ4mainENKUlRN4sycl3_V17handlerEE_clES2_E1C, !"maxntidz", i32 4}

!28 = !{i32 16, i32 8, i32 4}

Perhaps one way of making it nicer would be to lower from Attribute -> Function Metadata in CodeGenFunction.cpp, then lower the Function Metadata to NVVM annotations in NVPTX.cpp. That may give a clearer lowering path, separating out responsibilities.

@frasercrmck frasercrmck deleted the sycl-nvptx-reqd-wg-size branch July 11, 2024 10:11
@frasercrmck frasercrmck restored the sycl-nvptx-reqd-wg-size branch July 11, 2024 10:14
@frasercrmck frasercrmck reopened this Jul 11, 2024
@frasercrmck
Copy link
Contributor Author

Sorry, accidentally deleted the branch

Copy link
Contributor

@steffenlarsen steffenlarsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! I attempted to do this a long time ago (#3755), but there were problems due to us padding the dimensions of the attribute at the time. Lovely to see it finally happening. 😄

I am not too concerned about it taking a new path for NVPTX and generating special NVVM metadata. From what I remember, the current behavior for the attribute is heavily based on the SPIR-V translator consuming the metadata that the OpenCL-related attribute set a precedence for.

An alternative could be to have a late-stage LLVM transformation pass for consuming the "reqd_work_group_size" metadata and generating these NVVM attributes. A benefit of this is that we could use this for handling "sycl-work-group-size" produced by the SYCL compile-time property alternative for this attribute.

clang/lib/CodeGen/Targets/NVPTX.cpp Show resolved Hide resolved
@premanandrao
Copy link
Contributor

The code looks ok to me but this attribute is handled differently for different backends now and I do not know if that is ok (#13600). @premanandrao @steffenlarsen please weigh in here.

It's not ideal, no. I do think NVPTX.cpp is a good place to add NVPTX-specific metadata, and CodeGenFunction.cpp is a good place to add target-independent metadata.

We do need this NVPTX-specific lowering as the NVVM annotations are the only way the NVPTX backend is going to make use of the attributes - it's just ignored otherwise. I think the most regretful outcome if that we have the same information repeated in multiple different ways: see also how we are already handling [[intel::max_work_group_size]] attribute this way. For example, intel::max_work_group_size(4, 8, 16) gives us:

define  void @foo() max_work_group_size !28 {
   ...
}

!nvvm.annotations = !{19, !20, !21}

!19 = !{ptr @_ZTSZZ4mainENKUlRN4sycl3_V17handlerEE_clES2_E1C, !"maxntidx", i32 16}
!20 = !{ptr @_ZTSZZ4mainENKUlRN4sycl3_V17handlerEE_clES2_E1C, !"maxntidy", i32 8}
!21 = !{ptr @_ZTSZZ4mainENKUlRN4sycl3_V17handlerEE_clES2_E1C, !"maxntidz", i32 4}

!28 = !{i32 16, i32 8, i32 4}

Perhaps one way of making it nicer would be to lower from Attribute -> Function Metadata in CodeGenFunction.cpp, then lower the Function Metadata to NVVM annotations in NVPTX.cpp. That may give a clearer lowering path, separating out responsibilities.

Thanks for the explanation. I don't have a strong opinion on changing what you have, so I will approve this. But @elizabethandrews might, but she is out until at least Monday.

@frasercrmck
Copy link
Contributor Author

Great! I attempted to do this a long time ago (#3755), but there were problems due to us padding the dimensions of the attribute at the time. Lovely to see it finally happening. 😄

I am not too concerned about it taking a new path for NVPTX and generating special NVVM metadata. From what I remember, the current behavior for the attribute is heavily based on the SPIR-V translator consuming the metadata that the OpenCL-related attribute set a precedence for.

An alternative could be to have a late-stage LLVM transformation pass for consuming the "reqd_work_group_size" metadata and generating these NVVM attributes. A benefit of this is that we could use this for handling "sycl-work-group-size" produced by the SYCL compile-time property alternative for this attribute.

No problem! I wasn't aware you'd already tried it, sorry.

Yeah I agree the current lowering for NVPTX (and I'm guessing AMDGPU) leaves something to be desired. I think I'll look into this in a subsequent patch, if that's alright?

@steffenlarsen
Copy link
Contributor

No problem! I wasn't aware you'd already tried it, sorry.

Don't be sorry! It was just to say that I think this is the right direction and it's great to see it actually happening. 😄

Yeah I agree the current lowering for NVPTX (and I'm guessing AMDGPU) leaves something to be desired. I think I'll look into this in a subsequent patch, if that's alright?

You could, but I figure it will involve moving the transformation done here to a later stage to handle both the property and the attribute the same. I don't see a problem in that though, but I'll let @elizabethandrews and @premanandrao have the final say.

Copy link
Contributor

@elizabethandrews elizabethandrews left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay. I have been battling covid and working reduced hours. This PR LGTM.

My uneasiness has more to do with these attributes in general and not your implementation.

Copy link
Contributor

@smanna12 smanna12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks

@frasercrmck
Copy link
Contributor Author

This PR looks ready to merge, @intel/llvm-gatekeepers - thanks!

@sommerlukas sommerlukas merged commit fe18590 into intel:sycl Jul 16, 2024
27 checks passed
@frasercrmck frasercrmck deleted the sycl-nvptx-reqd-wg-size branch July 16, 2024 06:43
smanna12 pushed a commit to smanna12/llvm that referenced this pull request Jul 16, 2024
intel#14502)

Only emit the provided values as annotations in the LLVM IR. The NVPTX
backend will pad missing values with 1s. This suits the fact that the
attribute must provide as many values as the dimensionality of the
work-group, and we can assume that the work-group size of unused
dimensions is 1.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants