Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL]Fix SYCL target triple check for NVidia targets. #14505

Open
wants to merge 4 commits into
base: sycl
Choose a base branch
from

Conversation

srividya-sundaram
Copy link
Contributor

@srividya-sundaram srividya-sundaram commented Jul 9, 2024

-fsycl-targets enables AOT compilations for specified device targets.
For NVidia GPU targets, per oneAPI DPCPP documentation, the valid target triple string is nvptx64-nvidia-cuda
This PR, checks if the target triple string provided for NVidia targets is a valid one.

Example:

clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda-invalidEnv syclfile.cpp
clang++: error: SYCL target is invalid: 'nvptx64-nvidia-cuda-invalidEnv'

static bool isValidSYCLTriple(llvm::Triple TargetTriple) {
// Check for invalid SYCL device triple values for NVidia GPU targets.
// nvptx64-nvidia-cuda is the valid triple for NVidia targets.
if (TargetTriple.getArchTypeName(TargetTriple.getArch()) == "nvptx64" &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (TargetTriple.getArchTypeName(TargetTriple.getArch()) == "nvptx64" &&
if (TargetTriple.isNVPTX() &&

Why do we need to do string comparisons?

// Check for invalid SYCL device triple values for NVidia GPU targets.
// nvptx64-nvidia-cuda is the valid triple for NVidia targets.
if (TargetTriple.getArchTypeName(TargetTriple.getArch()) == "nvptx64" &&
TargetTriple.getVendorTypeName(TargetTriple.getVendor()) == "nvidia" &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once again, I would prefer to avoid string comparison. Target.getVendor() == Triple::VendorType::NVIDIA should be enough

.StartsWith("rtems", Triple::RTEMS)
.StartsWith("nacl", Triple::NaCl)
.StartsWith("aix", Triple::AIX)
.Case("cuda", Triple::CUDA)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm hesitant to accept this change at intel/llvm. If you can get this accepted to the upstream LLVM, then it is fine.

This change is at least inconsistent with handling of other OSes. For example, clang will still accept nvptx-nvidia-nvclmyown as a valid triple for OpenCL on NVIDIA, even though the spelling contains some extra myown.

My main concern is that no one prohibits using other things from intel/llvm besides SYCL (and I know for sure that folks are using OpenMP for example) and therefore this change could break someone's workflow that we don't know about. Since we don't mark our customizations to the upstream codebase in any way, this change could easily get lost, cause conflicts or be hard to find during debugging.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @AlexeySachkov

Currently, when we pass an invalid target triple value to fsycl-targets for NVidia targets , we get the following error:

fatal error: error in backend: Cannot select: intrinsic %llvm.nvvm.implicit.offset
llvm-foreach:
clang++: error: clang frontend command failed with exit code 70 (use -v to see invocation)

This error is non-intuitive and does not clearly describe what the problem is.

Currently, in the driver code base, there is no check to verify if the target triple string passed matches the exact value documented - nvptx64-nvidia-cuda
We only check if the ‘architecture’ component has ‘nvptx64’ and return that as a valid string.
Example: nvptx64-badVendor-badOS would still be passed as a valid triple to the toolchain and we end up getting an error.

For OpenMP, they do allow partial/shortened triple strings with empty Vendor or/and OS components. The following triples are all valid. Please see this function
nvptx64
nvptx64-nvidia
nvptx64--cuda

As for this change: .Case("cuda", Triple::CUDA)
When we just check using StartsWith , for an example like this - nvptx64-nvidia-cuda123,
the 123 get parsed as the OS Major version but the triple still gets rejected because cuda123 is not a valid OS component for offloading SYCL to nvptx backend.

I can remove the string comparisons as parseArch() , parseVendor() and parseOS() all use .Case match for Nvidia targets.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the best we can do is check for Triple::CUDA for the OS and not try to be strict about it. Sure we would be allowing OS values of cudaBadString, but given how doing so is consistent with other OS settings I'm hard pressed to deviate. If the failure occurs down the line when using cudaBadString, there may be an issue later in the compilation when using the triple that isn't using the known OS check.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error is non-intuitive and does not clearly describe what the problem is.

I agree, but I still don't think that the suggested customization is right in context of deviation from the community code, further maintenance effort and potential unexpected/unwanted side effects.

For OpenMP, they do allow partial/shortened triple strings with empty Vendor or/and OS components.

We should adopt this in sycl and it will automatically reduce the amount of uses of longer triple that is prone to errors. Moreover, -fsycl-targets will eventually move to --arch that accepts a target name and not a triple, resolving this issue all together.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants