[SYCL] Use L0 queries for L0 devices #2245
using zeInit_decl_t = ze_result_t (*)(ze_init_flags_t flags);
const ze_init_flags_t default_ze_flags = 0;
#if defined(__linux__)
static const ze_result_t ze_result = reinterpret_cast<zeInit_decl_t>(
You can actually omit the calls to zeInit, as the caller must invoke L0 prior to entering nGEN (to locate the device, create a context, etc.).
= {ZE_STRUCTURE_INTEL_DEVICE_MODULE_DP_EXP_PROPERTIES};
deviceModProps.pNext = &deviceModPropsExt;

CHECK(func_zeDeviceGetModuleProperties(device, &deviceModProps));
Do we have a driver support matrix? Users may have pretty old drivers (esp. on Linux), so I'm wondering if we need a fallback path here.
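One possible shape for such a fallback path, sketched with stand-in types rather than the real Level Zero API: if the query entry point could not be loaded or reports an error, treat every optional feature as absent. `Status`, `ModuleCaps`, `QueryFn`, and `queryCapsWithFallback` are all hypothetical names, and the choice of "no optional features" as the conservative default is an assumption, not something this PR specifies.

```cpp
#include <cstdint>

// Hypothetical stand-ins; the real code would use ze_result_t and
// ze_device_module_properties_t from the Level Zero headers.
enum class Status { Success, Unsupported };

struct ModuleCaps {
    std::uint32_t fp32Flags = 0; // base module properties
    std::uint32_t dpFlags = 0;   // properties reported via the extension
};

// Signature of a dynamically loaded query function (may be null when
// the installed driver does not export the symbol).
using QueryFn = Status (*)(ModuleCaps *out);

// Try the query; on a missing symbol or an error, fall back to a
// conservative result with no optional features reported.
constexpr ModuleCaps queryCapsWithFallback(QueryFn query) {
    ModuleCaps caps{};
    if (query != nullptr && query(&caps) == Status::Success) return caps;
    return ModuleCaps{};
}

// Two fake drivers used only to exercise both paths.
constexpr Status newDriverQuery(ModuleCaps *out) {
    out->fp32Flags = 0x3;
    out->dpFlags = 0x1;
    return Status::Success;
}

constexpr Status oldDriverQuery(ModuleCaps *) {
    return Status::Unsupported;
}
```

With this shape, `queryCapsWithFallback(oldDriverQuery)` and `queryCapsWithFallback(nullptr)` both yield zeroed capabilities, so callers would never enable a feature the driver did not confirm.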
= ZE_DEVICE_FP_ATOMIC_EXT_FLAG_GLOBAL_ADD
| ZE_DEVICE_FP_ATOMIC_EXT_FLAG_LOCAL_ADD;
No GPUs currently have floating point atomic adds in local memory, so we just need to know about global memory FP atomic support:
- = ZE_DEVICE_FP_ATOMIC_EXT_FLAG_GLOBAL_ADD
- | ZE_DEVICE_FP_ATOMIC_EXT_FLAG_LOCAL_ADD;
+ = ZE_DEVICE_FP_ATOMIC_EXT_FLAG_GLOBAL_ADD;
I think a similar change makes sense for load/store and min/max.
I aligned with the native_extension definition that the OCL queries have used so far (the flag is set only if both local and global are available).
I am fine with changing the semantics of native_extensions (it seems broken anyway, as you mention), but that might have wider implications.
Good point. Then I think it makes sense to keep L0 and OCL aligned and change both in a future PR.
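For reference, the two semantics discussed in this thread differ only in the mask that is tested. The sketch below uses stand-in bit values (`kGlobalAdd`, `kLocalAdd`) and hypothetical helper names instead of the real `ZE_DEVICE_FP_ATOMIC_EXT_FLAG_*` constants, so it illustrates the check rather than the actual code in this PR.

```cpp
#include <cstdint>

// Stand-in bit positions for illustration only; real code would use the
// ZE_DEVICE_FP_ATOMIC_EXT_FLAG_* constants from the Level Zero headers.
constexpr std::uint32_t kGlobalAdd = 1u << 1;
constexpr std::uint32_t kLocalAdd = 1u << 17;

// True only when every bit in `required` is present in `caps`.
constexpr bool hasAll(std::uint32_t caps, std::uint32_t required) {
    return (caps & required) == required;
}

// Semantic kept in this PR (aligned with the OCL path):
// the feature is reported only if both global and local adds are supported.
constexpr bool atomicAddBothScopes(std::uint32_t caps) {
    return hasAll(caps, kGlobalAdd | kLocalAdd);
}

// Semantic proposed for a future PR: global support alone is enough.
constexpr bool atomicAddGlobalOnly(std::uint32_t caps) {
    return hasAll(caps, kGlobalAdd);
}
```

A device that reports only the global bit passes `atomicAddGlobalOnly` but fails `atomicAddBothScopes`, which is the case the review comment above describes.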
This PR replaces OCL queries with matching L0 queries for L0 devices on Intel devices.
A few notable things: