[SYCL] Use L0 queries for L0 devices #2245

mgouicem · 2024-12-09T10:48:05Z

This PR replaces OCL queries with the matching L0 queries on Intel devices that use the L0 backend.
A few notable things:

Updated the L0 headers to access some recent queries.
Added headers for Intel-specific L0 extensions. These headers come from compute-runtime. We asked about the forward-compatibility status of these symbols and are waiting for clarification (which is why this PR is a draft).
Switched nGEN to load L0 dynamically.

mgouicem · 2024-12-09T10:52:20Z

make test
disable device_cpu
enable device_gpu
enable thr_sycl
enable thr_ocl

src/gpu/intel/jit/ngen/ngen_level_zero.hpp

petercad · 2024-12-10T16:18:31Z

src/gpu/intel/jit/ngen/ngen_level_zero.hpp

+    using zeInit_decl_t = ze_result_t (*)(ze_init_flags_t flags);
+    const ze_init_flags_t default_ze_flags = 0;
+#if defined(__linux__)
+    static const ze_result_t ze_result = reinterpret_cast<zeInit_decl_t>(


You can actually omit the calls to zeInit as the caller must invoke L0 prior to entering nGEN (to locate the device, create a context, etc.).
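For reference, resolving an entry point at run time without calling any init function could look roughly like the sketch below. It is POSIX-only, `find_symbol` is a hypothetical helper name rather than the one used in nGEN, and the library and symbol names are left to the caller.

```cpp
#include <dlfcn.h>

// Hypothetical helper: resolve `symbol` from the shared library `lib_name`.
// Returns nullptr when the library or the symbol cannot be found.
// The handle is intentionally never closed so the returned pointer stays valid.
template <typename Fn>
Fn find_symbol(const char *lib_name, const char *symbol) {
    void *handle = dlopen(lib_name, RTLD_NOW | RTLD_LOCAL);
    if (!handle) return nullptr;
    return reinterpret_cast<Fn>(dlsym(handle, symbol));
}
```

A caller would cast to the declared function-pointer type (as the `zeInit_decl_t` alias does in the hunk above) and check for nullptr before invoking it. No initialization call is needed here as long as the application has already initialized the runtime.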

src/gpu/intel/sycl/l0/utils.cpp

petercad · 2024-12-11T17:13:13Z

src/gpu/intel/sycl/l0/utils.cpp

+            = {ZE_STRUCTURE_INTEL_DEVICE_MODULE_DP_EXP_PROPERTIES};
+    deviceModProps.pNext = &deviceModPropsExt;
+
+    CHECK(func_zeDeviceGetModuleProperties(device, &deviceModProps));


Do we have a driver support matrix? Users may have pretty old drivers (esp. on Linux), so I'm wondering if we need a fallback path here.

mgouicem · 2024-12-12

For backward compatibility, we validate only the latest driver available at the time of release (see README). In this particular case, the query was added on October 31, 2023 (99abb40a).

For forward compatibility, I am working on getting clarification (and this PR will remain a draft in the meantime).
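To make the fallback question concrete, here is a minimal sketch of what such a path could look like. This is not what the PR does. All types and functions below are hypothetical stand-ins rather than real Level Zero definitions; only the `pNext` chaining idea is taken from the hunk above.

```cpp
#include <cstdint>

// Hypothetical stand-ins, not real Level Zero types.
enum class status_t { success, unsupported };

struct ext_props_t {
    void *pNext = nullptr;
    uint32_t dp_flags = 0;
};

struct base_props_t {
    void *pNext = nullptr; // extension chain, as with deviceModProps.pNext
    uint32_t flags = 0;
};

using query_fn_t = status_t (*)(base_props_t *);

// Returns the extension flags, or 0 (a conservative default) when the entry
// point is missing or the driver rejects the query.
uint32_t query_dp_flags(query_fn_t query) {
    if (!query) return 0; // symbol not exported by an old driver
    ext_props_t ext;
    base_props_t base;
    base.pNext = &ext;
    if (query(&base) != status_t::success) return 0;
    return ext.dp_flags;
}

// Fake drivers used only to exercise the sketch.
status_t fake_new_driver(base_props_t *p) {
    auto *ext = static_cast<ext_props_t *>(p->pNext);
    if (ext) ext->dp_flags = 0x3;
    return status_t::success;
}

status_t fake_old_driver(base_props_t *) {
    return status_t::unsupported;
}
```

Whether a silent default is acceptable, as opposed to propagating the error as `CHECK` does, depends on how the result is consumed downstream.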

petercad · 2024-12-11T17:17:13Z

src/gpu/intel/sycl/l0/utils.cpp

+            = ZE_DEVICE_FP_ATOMIC_EXT_FLAG_GLOBAL_ADD
+            | ZE_DEVICE_FP_ATOMIC_EXT_FLAG_LOCAL_ADD;


No GPUs currently have floating point atomic adds in local memory, so we just need to know about global memory FP atomic support:

Suggested change

-            = ZE_DEVICE_FP_ATOMIC_EXT_FLAG_GLOBAL_ADD
-            | ZE_DEVICE_FP_ATOMIC_EXT_FLAG_LOCAL_ADD;
+            = ZE_DEVICE_FP_ATOMIC_EXT_FLAG_GLOBAL_ADD;

I think a similar change makes sense for load/store and min/max.

mgouicem · 2024-12-12

I aligned with the native_extension definition that the OCL queries have used so far (the flag is set only if both local and global are available).

I am fine with changing the semantics of native_extensions (they seem broken anyway, as you mention), but that might have wider implications.

petercad · 2024-12-12

Good point. Then I think it makes sense to keep L0 and OCL aligned and change both in a future PR.
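The two semantics discussed in this thread can be compared with a small sketch. The constants are hypothetical stand-ins with illustrative bit values, not the definitions from the L0 headers, and the function names are made up for illustration.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the FP atomic capability bits; the values are
// illustrative and are not copied from the Level Zero headers.
constexpr uint32_t GLOBAL_ADD = 1u << 0;
constexpr uint32_t LOCAL_ADD = 1u << 1;

// Current semantics (aligned with the OCL path): the feature is reported
// only when both global and local atomic add are available.
constexpr bool has_atomic_add_both(uint32_t caps) {
    constexpr uint32_t mask = GLOBAL_ADD | LOCAL_ADD;
    return (caps & mask) == mask;
}

// Suggested semantics: only global-memory support is required.
constexpr bool has_atomic_add_global(uint32_t caps) {
    return (caps & GLOBAL_ADD) == GLOBAL_ADD;
}
```

For a device that reports only the global bit, the first check returns false and the second returns true, which is the difference the suggestion is about.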

github-actions bot added the platform:gpu-intel Codeowner: @oneapi-src/onednn-gpu-intel label Dec 9, 2024

mgouicem requested a review from a team December 9, 2024 10:48

mgouicem force-pushed the mgouicem/main/l0_queries branch 2 times, most recently from e0dc26f to 2c29101 on December 9, 2024 12:35

petercad reviewed Dec 9, 2024

src/gpu/intel/jit/ngen/ngen_level_zero.hpp (outdated)

petercad reviewed Dec 9, 2024

src/gpu/intel/jit/ngen/ngen_level_zero.hpp (outdated)

mgouicem force-pushed the mgouicem/main/l0_queries branch from 2c29101 to 6ef7b93 on December 10, 2024 09:39

petercad reviewed Dec 10, 2024

petercad approved these changes Dec 10, 2024

echeresh approved these changes Dec 11, 2024

mgouicem added 2 commits December 11, 2024 02:36

gpu:intel:ngen: load level_zero dynamically

ad344e7

gpu:intel:sycl: update level zero headers to 1.19

3836ec7

mgouicem force-pushed the mgouicem/main/l0_queries branch from 6ef7b93 to 916aff8 on December 11, 2024 12:41

github-actions bot added the devops Github automation label Dec 11, 2024

gpu:intel:level_zero: add intel extension headers v24.48

aff18b8

mgouicem force-pushed the mgouicem/main/l0_queries branch from 6114957 to aa79d8d on December 11, 2024 13:01

mgouicem added 3 commits December 11, 2024 05:26

gpu:intel:sycl: use only l0 queries for l0 devices

c654687

gpu:intel:sycl,ocl: properly propagate status from init_gpu_hw_info

cc56329

ci: fix scope check for same level scopes

9c03e06

mgouicem force-pushed the mgouicem/main/l0_queries branch from aa79d8d to 9c03e06 on December 11, 2024 13:26

petercad reviewed Dec 11, 2024
