-
Notifications
You must be signed in to change notification settings - Fork 715
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL][XPTI] Refactoring framework to use 128-bit keys for collision elimination #14467
base: sycl
Are you sure you want to change the base?
Conversation
+ Add functions to set and enable scope notification of tracepoints when tracepoints are refactored to support self notification Signed-off-by: Vasanth Tovinkere <[email protected]>
+ Moved from std::unordered_map for most containers to emhash8 which improves performance on microbenchmarks from ~1.5X to 6X for various operations. Also improves the performance of callback handlers by ~1.6X + Added 128-bit data structure for implementing the collision free Universal ID Signed-off-by: Vasanth Tovinkere <[email protected]>
+ Framework now is built on 128-bit keys using new APIs and legacy API support is handled with 64-bit mappings to 128-bit keys Signed-off-by: Vasanth Tovinkere <[email protected]>
+ New approach to saving all necessary data in TLS so lookups can be avoided. Lookups take the most amount of time/event. + Moved from emhash8 to phmap as phmap provides both node_hash_map and flat_hash_map and the performance is better than std implementations + Added tests to test APIs and its correctness + Updated documentation for framework implementation and headers Signed-off-by: Vasanth Tovinkere <[email protected]>
+ Basic collector will now handle a few more events Signed-off-by: Vasanth Tovinkere <[email protected]>
+ Revamped implementation to use the same amount of memory, but have the option of reducing the number of lookups + Updated unit tests to reflect the new approach where 128-bit and 64-bit keys are always accessible. Signed-off-by: Vasanth Tovinkere <[email protected]>
+ Updated scomped objects and added new methods that combine UID creation, payload registration and trace point creation into a single call xptiRegisterTracepointScope () or xptiCreateTracepoint(). + Updated tests and addded documentation Signed-off-by: Vasanth Tovinkere <[email protected]>
Signed-off-by: Vasanth Tovinkere <[email protected]>
+ Previous implementation of XPTI required the trace event to be invariant for a given code location. However, with multiple threads being active at the same tracepoint, previous design could not handle mutable metadata. TO assist with this, every visit to a tracepoint generates a trace event. Signed-off-by: Vasanth Tovinkere <[email protected]>
Signed-off-by: Vasanth Tovinkere <[email protected]>
+ New macros to help make instrumenting code easier. A Part 2 PR will migrate all instrumentation using older API to use new helper classes and macros Signed-off-by: Vasanth Tovinkere <[email protected]>
Signed-off-by: Vasanth Tovinkere <[email protected]>
Signed-off-by: Vasanth Tovinkere <[email protected]>
+ basic_event_collection_linux.cpp test was expecting a "kernel_name" metadata for an edge event. That should be covered by the source and target universal IDs. Edges do not have a kernel associated with them. Signed-off-by: Vasanth Tovinkere <[email protected]>
Signed-off-by: Vasanth Tovinkere <[email protected]>
Signed-off-by: Vasanth Tovinkere <[email protected]>
+ Since these hasmaps excel for different types of keys, using both for specific maps gives the best performance. Signed-off-by: Vasanth Tovinkere <[email protected]>
+ Not conrolled by XPTI, but applied clag-format Signed-off-by: Vasanth Tovinkere <[email protected]>
| Trace Point Type | Parameter Description | Metadata | | ||
| :--------------: | :-------------------- | :------- | | ||
| Trace Point Type | Parameter Description | Metadata | | ||
| :--------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :------- | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not see any change in viewer here but it looks suspicious
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@KseniyaTikhomirova This is a case of clang-format being applied to the file which tries to modify of MD format is written. Should be fine as it looks fine in the Markdown viewer.
sycl/source/detail/event_impl.cpp
Outdated
|
||
} else { | ||
// If queue.wait() is used, we want to make sure the information about the | ||
// queue is available with the eait events. We check to see if the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mistype, "wait"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
WaitEvent = | ||
(TraceEvent ? static_cast<xpti_td *>(TraceEvent) : GSYCLGraphEvent); | ||
} else | ||
WaitEvent = GSYCLGraphEvent; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: the following looks a bit more simpler
void *TraceEvent = nullptr;
if (QueueImplPtr Queue = MQueue.lock()) {
TraceEvent = Queue->getTraceEvent();
WaitEvent = (TraceEvent ? static_cast<xpti_td *>(TraceEvent) : GSYCLGraphEvent);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we handle the if(QueueImplPtr Queue = MQueue.lock()) being false case?
// Record the current instance ID for use by Epilog | ||
IId = InstanceID++; | ||
IId = xptiGetUniqueId(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand this correct xptiGetUniqueID() will return same value for every wait call on the same event.
as I see from previous implementation it was counting wait call count here (and it seems to be intentional according to comments about thread-safety and having value per wait call, not per object) and use it as InstanceID. that means that this update brings functional value. To be honest I do not understand if it was needed by someone to count wait calls so this change should be ok. Just highlighting that we changing things here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@KseniyaTikhomirova xptiGetUniqueID() returns a new 64-bit unique value. xptiGetUniversalID() will get the same 64-bit value for a given trace point.
sycl/source/detail/queue_impl.hpp
Outdated
@@ -311,16 +259,10 @@ class queue_impl { | |||
// lifetime of the queue object as member variables when ABI breakage is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably old comment should be removed now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done! It was a stale comment.
|
||
void selfNotifyBegin() { | ||
// selfNotify() is supposed to capture the scope of all the tracepoints we | ||
// have enbled; if we need to limit the notifications only to when the trace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same mistype
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed!
void selfNotifyBegin() { | ||
// selfNotify() is supposed to capture the scope of all the tracepoints we | ||
// have enbled; if we need to limit the notifications only to when the trace | ||
// scope has valid TLS data, we can add the follwoing IF block if |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same formatting issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above comment to the same issue.
bool MScopedNotify = false; | ||
/// Stores the tracepoint data from creating the tracepoint or from | ||
/// inheritance | ||
xpti_tracepoint_t *MData; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needs initialization, not all of these members is covered in constructor. code analyzer will complain about this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
/// The type of the trace event: function_begin, signal, etc | ||
uint16_t MTraceType; | ||
/// The correlation ID for the notification begin/end scope calls | ||
uint64_t MCorrelationId; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not initialized in ctor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MTraceType is, fixed MCorrelationId only to fix warnings as it is actually set in the constructor.
@@ -663,6 +1589,7 @@ class tracepoint_t { | |||
m_trace_event = xptiMakeEvent( | |||
m_default_name, const_cast<xpti::payload_t *>(m_payload), | |||
m_default_event_type, m_default_activity_type, &m_instID); | |||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For backward compatibility tests and in the next pass, it will be deprecated.
Signed-off-by: Vasanth Tovinkere <[email protected]>
Signed-off-by: Vasanth Tovinkere <[email protected]>
Signed-off-by: Vasanth Tovinkere <[email protected]>
Signed-off-by: Vasanth Tovinkere <[email protected]>
Signed-off-by: Vasanth Tovinkere <[email protected]>
Signed-off-by: Vasanth Tovinkere <[email protected]>
Signed-off-by: Vasanth Tovinkere <[email protected]>
Signed-off-by: Vasanth Tovinkere <[email protected]>
Signed-off-by: Vasanth Tovinkere <[email protected]>
Signed-off-by: Vasanth Tovinkere <[email protected]>
Signed-off-by: Vasanth Tovinkere <[email protected]>
Previous implementation of the XPTI framework used 64-bit hash values to represent trace points in the code and this has led to a few of hash collisions.This refactoring moves to a 128-bit key to guarantee uniqueness. The changes needed to SYCL runtime to fully migrate to newer APIs will be pushed as a separate Part 2 pull request. Current pull request include changes to the XPTI framework and minor changes to SYCL runtime to reflect the transition to 128-bit keys and ensure validity of the tests.
NOTE: Since more events are generated due to the creation of a new trace event for each trace point instance, some tests that rely on event sequences may have to be updated.